(57) Abstract

The present invention describes the isolation and sequence of genes from Flavobacterium heparinum encoding heparin and heparan 
sulfate degrading enzymes, heparinase II and heparinase III (EC 4.2.2.8). It further describes a method of expressing and an expression
system for heparinases I, II and III using a modified ribosome binding region derived from a promoter from glycosaminoglycan lyase genes
of F. heparinum. Also, a multi-step protein purification method incorporating cell disruption, cation exchange chromatography, affinity 
chromatography and hydroxylapatite chromatography is outlined. Antibodies against a post-translational modification moiety common 
to Flavobacterium heparinum proteins and a method to obtain antibodies specific to these moieties and to the amino acid sequences of 
heparinases I, II and III are described.
NUCLEIC ACID SEQUENCES AND EXPRESSION SYSTEMS FOR
HEPARINASE II AND HEPARINASE III 
DERIVED FROM Flavobacterium heparinum 


BACKGROUND OF THE INVENTION 

This invention is directed to cloning, sequencing and expressing 
heparinase II and heparinase III from Flavobacterium heparinum. 

The heparin and heparan sulfate family of molecules is comprised of
glycosaminoglycans of repeating glucosamine and hexuronic acid residues,
either iduronic or glucuronic, in which the 2, 3 or 6 position of glucosamine
or the 2 position of the hexuronic acid may be sulfated. Variations in the
extent and location of sulfation as well as conformation of the alternating
hexuronic acid residue lead to a high degree of heterogeneity of the
molecules within this family. Conventionally, heparin refers to molecules
which possess a high sulfate content, 2.6 sulfates per disaccharide, and a
higher amount of iduronic acid. Conversely, heparan sulfate contains lower
amounts of sulfate, 0.7 to 1.3 sulfates per disaccharide, and less iduronic
acid. However, variants of intermediate composition exist and heparins
from all biological sources have not yet been characterized.

Specific sulfation/glycosylation patterns of heparin have been 
associated with biological function, such as the antithrombin binding site 
described by Choay et al., Thrombosis Res. 18: 573-578 (1980), and the 
fibroblast growth factor binding site described by Turnbull et al., J. Biol.
Chem. 267: 10337-10341 (1992). It is apparent from these examples that
heparin's interaction with certain molecules results from the conformation 
imparted by specific sequences and not solely due to electrostatic 
interactions imparted by its high sulfate composition. Heparin interacts 
with a variety of mammalian molecules, thereby modulating several 
biological events such as hemostasis, cell proliferation, migration and
adhesion as summarized by Kjellen and Lindahl, Ann. Rev. Biochem. 60:
443-475 (1991) and Burgess and Maciag, Ann. Rev. Biochem. 58: 575-606
(1989). Heparin, extracted from bovine lungs and porcine intestines, has
been used as an anticoagulant since its antithrombotic properties were
discovered by McLean, Am. J. Physiol. 41: 250-257 (1916). Heparin and
chemically modified heparins are continually under review for medical
applications in the areas of wound healing and treating vascular disease.

Heparin degrading enzymes, referred to as heparinases or heparin
lyases, have been identified in several microorganisms including
Flavobacterium heparinum, Bacteroides sp. and Aspergillus nidulans as
summarized by Linhardt et al., Appl. Biochem. Biotechnol. 12: 135-177
(1986). Heparan sulfate degrading enzymes, referred to as heparitinases
or heparan sulfate lyases, have been detected in platelets (Oldberg et al.,
Biochemistry 19: 5755-5762 (1980)), tumor (Nakajima et al., J. Biol. Chem.
259: 2283-2290 (1984)) and endothelial cells (Gaal et al., Biochem.
Biophys. Res. Comm. 161: 604-614 (1989)). Mammalian heparanases
catalyze the hydrolysis of the carbohydrate backbone of heparan sulfate at
the hexuronic acid (1 → 4) glucosamine linkage (Nakajima et al., J. Cell.
Biochem. 36: 157-167 (1988)) and are inhibited by the highly sulfated
heparin. However, accurate biochemical characterization of these
enzymes has thus far been prevented by the lack of a method to obtain
homogeneous preparations of the molecules.

Flavobacterium heparinum produces heparin and heparan sulfate 
degrading enzymes termed heparinase I (E.C. 4.2.2.7) as described by Yang 
et al., J. Biol. Chem. 260(3): 1849-1857 (1985), heparinase II as described 
by Zimmermann and Cooney, U.S. Patent No. 5,169,772, and heparinase III 
(E.C. 4.2.2.8) as described by Lohse and Linhardt, J. Biol. Chem. 267: 24347-24355
(1992). These enzymes catalyze an eliminative cleavage of the
(α1 → 4) carbohydrate bond between glucosamine and hexuronic acid
residues in the heparin/heparan sulfate backbone. The three enzyme
variants differ in their action on specific carbohydrate residues.
Heparinase I cleaves at α-D-GlcNp2S6S(1 → 4)α-L-IdoAp2S, heparinase
III at α-D-GlcNp2Ac(or 2S)6OH(1 → 4)β-D-GlcAp and heparinase II at
either linkage as described by Desai et al., Arch. Biochem. Biophys.
306(2): 461-468 (1993). Secondary cleavage sites for each enzyme also
have been described by Desai et al.

Heparinase I has been used clinically to neutralize the 
anticoagulant properties of heparin as summarized by Baugh and 
Zimmermann, Perfusion Rev. 1(2): 8-13, 1993. Heparinase I and III 
have been shown to modulate cell-growth factor interactions as 
demonstrated by Bashkin et al., J. Cell Physiol. 151:126-137 (1992) and
cell-lipoprotein interactions as demonstrated by Chappell et al., J. Biol.
Chem. 268(19):14168-14175 (1993). The availability of heparin
degrading enzymes of sufficient purity and quantity could lead to the 
development of important diagnostic and therapeutic formulations. 

SUMMARY OF THE INVENTION 
Prior to the present invention, partially purified heparinases II
and III were available, but their amino acid sequences were unknown.
Cloning these enzymes was difficult because of toxicity to the host cells.
The present inventors were able to clone the genes for heparinases II
and III, and herein provide their nucleotide and amino acid sequences.

A method is described for the isolation of highly purified heparin
and heparan sulfate degrading enzymes from F. heparinum.
Characterization of each protein demonstrated that heparinases I, II and
III are glycoproteins. All three proteins are modified at their N- 
terminal amino acid residue. Antibodies generated by injecting purified 
heparinases into rabbits yielded anti-sera which demonstrated a high 
degree of cross reactivity to proteins from F. heparinum. Polyclonal 
antibodies were separated by affinity chromatography into fractions 
which bind the amino acid portion of the proteins and a fraction which 
binds the post-translational modification allowing for the use of these 
antibodies to specifically distinguish the native and recombinant forms 
of each heparinase protein. 

Amino acid sequence information was used to synthesize 
oligonucleotides that were subsequently used in a polymerase chain 
reaction (PCR) to amplify a portion of the heparinase II and heparinase 
III genes. Amplified regions were used in an attempt to identify clones 
from a λDASH-II gene library which contained F. heparinum genomic
DNA. Natural selection against clones containing the entire heparinase 
II and III genes was observed. This was circumvented by cloning 
sections of the heparinase II gene separately, and by screening host 
strains for stable maintenance of complete heparinase III clones. 
Expression of heparinase II and III was achieved by use of a vector 
containing a modified ribosome binding site which was shown to 
increase the expression of heparinase I to significant levels. 

This patent describes the gene and amino acid sequences for 
heparinase II and III from F. heparinum, which may be used in 
conjunction with suitable expression systems to produce the enzymes. 
Also described is a modified ribosome binding sequence used to
express heparinase I, II, and III. 
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the modifications to the tac promoter ribosome
binding region, which were evaluated for the level of expression of
heparinase I. The original sequence, as found in pBhep, and the modified
sequences, as found in pGhep and pA4hep, are shown with the
Shine-Dalgarno sequences (S-D) and the heparinase I gene start codon,
underlined. The gap (in nucleotides, nt) between these regions is indicated
below each sequence. The ribosome binding region for pGB contains no
start codon, and has a BamHI site (underlined) in place of the EcoRI site
(GAATTC) found in pGhep.

Figure 2 shows the construction of plasmids used to sequence the
heparinase II gene from Flavobacterium heparinum. Restriction sites are:
N = NotI, Nc = NcoI, S = SalI, B = BamHI, P = PstI, E = EcoRI, H = HindIII, C =
ClaI and K = KpnI.

Figure 3 shows the construction of pGBH2, a plasmid capable of
directing the expression of active heparinase II in E. coli from tandem tac
promoters (double arrow heads). Restriction sites are: B = BamHI, P = PstI.

Figure 4 shows the nucleic acid sequence for the heparinase II gene
from Flavobacterium heparinum (SEQ ID NO:1).

Figure 5 shows the amino acid sequence for heparinase II from
Flavobacterium heparinum (SEQ ID NO:2). The leader peptide sequence is
underlined. The mature protein starts at Q-26. Peptides 2A, 2B and 2C are
indicated at their corresponding positions within the protein.

Figure 6 shows the construction of plasmids used to sequence the
heparinase III gene from Flavobacterium heparinum. Restriction sites are:
S = SalI, B = BamHI, P = PstI, E = EcoRI, H = HindIII, C = ClaI and K = KpnI.

Figure 7 shows the construction of pGBH3, a plasmid capable of
directing the expression of active heparinase III in E. coli from a tandem
tac promoter (double arrow heads). Restriction sites are: S = SalI, B =
BamHI, P = PstI, E = EcoRI, H = HindIII, Bs = BspEI, C = ClaI and K = KpnI.

Figure 8 shows the nucleic acid sequence for the heparinase III gene
from Flavobacterium heparinum (SEQ ID NO:3).

Figure 9 shows the amino acid sequence for heparinase III from
Flavobacterium heparinum (SEQ ID NO:4). The leader peptide sequence is
underlined. The mature protein starts at Q-25. Peptides 3A, 3B and 3C are 
indicated at their corresponding positions within the protein. 

DETAILED DESCRIPTION OF THE INVENTION 

To aid in the understanding of the specification and claims, including 
the scope to be given such terms, the following definitions are provided. 

Gene. By the term "gene" is intended a DNA sequence which encodes
through its template or messenger RNA a sequence of amino acids 
characteristic of a specific peptide. Further, the term includes intervening, 
non-coding regions, as well as regulatory regions, and can include 5' and 3' 
ends. 

Gene sequence. The term "gene sequence" is intended to refer
generally to a DNA molecule which contains one or more genes, or gene 
fragments, as well as a DNA molecule which contains a non-transcribed or 
non-translated sequence. The term is further intended to include any 
combination of gene(s), gene fragments(s), non-transcribed sequence(s) or 
non-translated sequence(s) which are present on the same DNA molecule. 

The present sequences may be derived from a variety of sources 
including DNA, synthetic DNA, RNA, or combinations thereof. Such gene 
sequences may comprise genomic DNA which may or may not include 
naturally occurring introns. Moreover, such genomic DNA may be obtained
in association with promoter regions or poly A sequences. The gene 
sequences, genomic DNA or cDNA may be obtained in any of several ways. 
Genomic DNA can be extracted and purified from suitable cells, such as 
brain cells, by means well known in the art. Alternatively, mRNA can be 
isolated from a cell and used to produce cDNA by reverse transcription or 
other means. 

Recombinant DNA. By the term "recombinant DNA" is meant a 
molecule that has been recombined by in vitro splicing cDNA or a genomic 
DNA sequence. 

Cloning Vehicle. A plasmid or phage DNA or other DNA sequence 
which is able to replicate in a host cell. The cloning vehicle is characterized 
by one or more endonuclease recognition sites at which its DNA sequences
may be cut in a determinable fashion without loss of an essential biological 
function of the DNA, which may contain a marker suitable for use in the 
identification of transformed cells. Markers include for example, 
tetracycline resistance or ampicillin resistance. The word vector can be 
used to connote a cloning vehicle. 

Expression Control Sequence. A sequence of nucleotides that controls
or regulates expression of structural genes when operably linked to those
genes. They include the lac system, the trp system, major operator and
promoter regions of the phage lambda, the control region of fd coat protein 
and other sequences known to control the expression of genes in 
prokaryotic or eukaryotic cells. 

Expression vehicle. A vehicle or vector similar to a cloning vehicle
but which is capable of expressing a gene which has been cloned into it,
after transformation into a host. The cloned gene is usually placed under
the control of (i.e., operably linked to) certain control sequences such as
promoter sequences. Expression control sequences will vary depending on
whether the vector is designed to express the operably linked gene in a 
prokaryotic or eukaryotic host and may additionally contain 
transcriptional elements such as enhancer elements, termination 
sequences, tissue-specificity elements, and/or translational initiation and 
termination sites. 

Promoter. The term "promoter" is intended to refer to a DNA 
sequence which can be recognized by an RNA polymerase. The presence of 
such a sequence permits the RNA polymerase to bind and initiate 
transcription of operably linked gene sequences. 

Promoter region. The term "promoter region" is intended to broadly
include both the promoter sequence as well as gene sequences which may 
be necessary for the initiation of transcription. The presence of a promoter 
region is, therefore, sufficient to cause the expression of an operably linked 
gene sequence. 

Operably Linked. As used herein, the term "operably linked" means
that the promoter controls the initiation of expression of the gene. A 
promoter is operably linked to a sequence of proximal DNA if upon 
introduction into a host cell the promoter determines the transcription of 
the proximal DNA sequence or sequences into one or more species of RNA. 
A promoter is operably linked to a DNA sequence if the promoter is 
capable of initiating transcription of that DNA sequence.

Prokaryote. The term "prokaryote" is meant to include all organisms 
without a true nucleus, including bacteria. 

Host. The term "host" is meant to include not only prokaryotes, but
also such eukaryotes as yeast and filamentous fungi, as well as plant and
animal cells. The term includes any organism or cell that is the recipient of a
replicable expression vehicle.

The present invention is based on the cloning and expression of two
previously uncloned enzymes. Although heparinases II and III had been
partially purified previously, no amino acid sequences were available.
Specifically, the invention discloses the cloning, sequencing and expression
of heparinases II and III from Flavobacterium heparinum and the use of a
modified ribosome binding region for expression of these genes. In
addition to the nucleotide sequences, the amino acid sequences of
heparinases II and III are also provided. The invention further provides
expressed heparinases I, II and III, as well as methods of expressing those
enzymes.

Cloning was accomplished using degenerate and "guessmer"
nucleotide primers derived from amino acid sequences of fragments of the
heparinases, purified as described below in detail. The amino acid
sequences were previously unavailable. Cloning was exceptionally difficult
because of the unexpected problem of F. heparinum DNA toxicity in E. coli.
The inventors discovered techniques for solving this problem, as described
below in detail. Based on this disclosure, one skilled in the art can readily
clone additional heparinases and other proteins from F. heparinum or from
additional sources using the novel methods described within.
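
As a hedged illustration of how a degenerate primer can be derived from peptide sequence data, the following Python sketch back-translates a short peptide into an IUPAC-coded oligonucleotide pool and counts the distinct sequences in that pool. The peptide MKDWEF is hypothetical and is not taken from the heparinase sequences, and the function name degenerate_primer is an assumption made for illustration. A "guessmer" would instead choose a single most likely codon per residue from codon usage data, which is not given here and is therefore not modeled.

```python
# Standard genetic code: amino acid (one-letter) -> DNA codons.
CODONS = {
    "A": "GCT GCC GCA GCG", "R": "CGT CGC CGA CGG AGA AGG",
    "N": "AAT AAC", "D": "GAT GAC", "C": "TGT TGC", "Q": "CAA CAG",
    "E": "GAA GAG", "G": "GGT GGC GGA GGG", "H": "CAT CAC",
    "I": "ATT ATC ATA", "L": "TTA TTG CTT CTC CTA CTG", "K": "AAA AAG",
    "M": "ATG", "F": "TTT TTC", "P": "CCT CCC CCA CCG",
    "S": "TCT TCC TCA TCG AGT AGC", "T": "ACT ACC ACA ACG",
    "W": "TGG", "Y": "TAT TAC", "V": "GTT GTC GTA GTG",
}

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_primer(peptide):
    """Return (IUPAC primer, number of distinct oligos in the pool).

    Each codon position is collapsed independently, so for residues with
    six codons (L, S, R) the pool also contains a few non-coding variants.
    """
    primer = []
    pool_size = 1
    for residue in peptide:
        codons = CODONS[residue].split()
        for position in range(3):
            bases = "".join(sorted({codon[position] for codon in codons}))
            primer.append(IUPAC[bases])
            pool_size *= len(bases)
    return "".join(primer), pool_size


# Hypothetical peptide, chosen only because its residues have few codons.
primer, pool_size = degenerate_primer("MKDWEF")
print(primer, pool_size)
```

Peptide stretches rich in methionine and tryptophan (one codon each) keep the pool small, which is the usual reason such regions are favored when designing degenerate primers.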

Expression of the heparinases is a further disclosure of the present
invention. To express heparinases I, II and III, transcriptional and
translational signals recognizable by an appropriate host are necessary.
The cloned heparinases encoding sequences, obtained through the methods
described above, and preferably in a double-stranded form, may be
operably linked to sequences controlling transcriptional expression in an
expression vector, and introduced into a host cell, either prokaryote or
eukaryote, to produce recombinant heparinases or a functional derivative
thereof. Depending upon which strand of the heparinases encoding
sequence is operably linked to the sequences controlling transcriptional
expression, it is also possible to express heparinases antisense RNA or a 
functional derivative thereof. 

For the expression of heparinases I, II and III in E. coli, vectors were
constructed wherein expression was driven by two repeats of the tac
promoter. Modifications of the ribosome binding region of this promoter
were made by introducing mutations with the polymerase chain reaction.
In a preferred modification of the expression vector, the minimal
consensus Shine-Dalgarno sequence was improved by introducing a single
mutation (AGGAA -> AGGAG), which had the further advantage of
decreasing the number of nucleotides between the Shine-Dalgarno
sequence and the ATG start codon. Further modifications were produced
using PCR in which the gap between the Shine-Dalgarno sequence and the
start codon was further reduced. Using the same techniques, additional
modifications in this region, including insertions and deletions, can be
produced to create additional heparinase expression vectors. As a result,
an expression vector for the expression of heparinases is provided which
comprises a modified ribosome binding region containing a 5 base pair
Shine-Dalgarno sequence, a 9 base pair spacer region between the
Shine-Dalgarno sequence and the ATG start codon, and a recombinant
nucleotide sequence encoding a heparinase. Also provided are modifications
to this vector comprising changing the length and sequence of the
Shine-Dalgarno sequence, and reducing the spacing between the
Shine-Dalgarno sequence and the start codon to 8, 7, 6, 5, 4 or fewer
nucleotides. Methods of expressing the heparinases using these novel
expression vectors comprise a preferred embodiment of the invention.
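
The spacing described above can be checked computationally. The following Python sketch, offered only as an illustration, measures the gap between a Shine-Dalgarno motif and the first downstream ATG and then trims the spacer to the shorter lengths mentioned in the text. The REGION sequence is hypothetical (it is not the sequence of pGhep or any other plasmid named here), and the function names sd_to_start_gap and shorten_spacer are assumptions introduced for this example.

```python
SD = "AGGAG"      # 5 base pair Shine-Dalgarno sequence from the text
START = "ATG"     # start codon

# Hypothetical ribosome binding region: flank + SD + 9 nt spacer + start.
REGION = "CACAC" + SD + "TTTAAACCC" + START + "AAA"


def sd_to_start_gap(region, sd=SD, start=START):
    """Count nucleotides between the SD motif and the first downstream start."""
    sd_pos = region.find(sd)
    if sd_pos == -1:
        raise ValueError("Shine-Dalgarno motif not found")
    sd_end = sd_pos + len(sd)
    start_pos = region.find(start, sd_end)
    if start_pos == -1:
        raise ValueError("start codon not found downstream of the motif")
    return start_pos - sd_end


def shorten_spacer(region, new_gap, sd=SD, start=START):
    """Delete nucleotides from the 3' end of the spacer to reach new_gap."""
    gap = sd_to_start_gap(region, sd, start)
    if not 0 <= new_gap <= gap:
        raise ValueError("new_gap must be between 0 and the current gap")
    sd_end = region.find(sd) + len(sd)
    start_pos = sd_end + gap
    return region[:sd_end + new_gap] + region[start_pos:]


print("original gap:", sd_to_start_gap(REGION))
for target in (8, 7, 6, 5, 4):
    variant = shorten_spacer(REGION, target)
    print(target, variant, sd_to_start_gap(variant))
```

A deletion can in principle create a new ATG upstream of the intended start codon, so re-measuring the gap on each variant, as the loop does, is a useful sanity check before a construct is ordered or amplified.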

Expression of the heparinases in different hosts may result in
different post-translational modifications which may alter the properties
of the heparinases. The heparinases, or a functional derivative thereof, may
be expressed in eukaryotic cells, and especially mammalian, insect and yeast
cells. Especially preferred eukaryotic hosts are mammalian cells either in
vivo, in animals or in tissue culture. Mammalian cells provide
post-translational modifications to recombinant heparinases which include
folding and/or glycosylation at sites similar or identical to that found for
the native heparinases. Most preferably, mammalian host cells include brain
and neuroblastoma cells.

A nucleic acid molecule, such as DNA, is said to be "capable of 
expressing" a polypeptide if it contains expression control sequences which 
contain transcriptional regulatory information and such sequences are 
"operably linked" to the nucleotide sequence which encodes the 
polypeptide. 

An operable linkage is a linkage in which a sequence is connected to 
a regulatory sequence (or sequences) in such a way as to place expression 
of the sequence under the influence or control of the regulatory sequence. 
Two DNA sequences (such as a heparinases encoding sequence and a 
promoter region sequence linked to the 5' end of the encoding sequence) 
are said to be operably linked if induction of promoter function results in 
the transcription of the heparinases encoding sequence mRNA and if the 
nature of the linkage between the two DNA sequences does not (1) result 
in the introduction of a frame-shift mutation, (2) interfere with the ability 
of the expression regulatory sequences to direct the expression of the 
heparinases, or (3) interfere with the ability of the heparinases template to 
be transcribed by the promoter region sequence. Thus, a promoter region 
would be operably linked to a DNA sequence if the promoter were capable 
of effecting transcription of that DNA sequence. 

The precise nature of the regulatory regions needed for gene 
expression may vary between species or cell types, but in general includes,
as necessary, 5' non-transcribing and 5' non-translating (non-coding)
sequences involved with initiation of transcription and translation
respectively, such as the TATA box, capping sequence, CAAT sequence, and
the like. Especially, such 5' non-transcribing control sequences will include
a region which contains a promoter for transcriptional control of the
operably linked gene.

If desired, a fusion product of the heparinases may be constructed.
For example, the sequence coding for heparinases may be linked to a signal
sequence which will allow secretion of the protein from, or the
compartmentalization of the protein in, a particular host. Such signal
sequences may be designed with or without specific protease sites such
that the signal peptide sequence is amenable to subsequent removal.
Alternatively, the native signal sequence for this protein may be used.

Transcriptional initiation regulatory signals can be selected which
allow for repression or activation, so that expression of the operably linked
genes can be modulated.

Based on this disclosure, one skilled in the art can readily place the
sequences of the present invention in additional expression vectors and
transform into a variety of bacteria to obtain recombinant heparinase II or
heparinase III.

Once the vector or DNA sequence containing the construct(s) is
prepared for expression, the DNA construct(s) is introduced into an
appropriate host cell by any of a variety of suitable means, including
transfection. After the introduction of the vector, recipient cells are grown
in a selective medium, which selects for the growth of vector-containing
cells. Expression of the cloned gene sequence(s) results in the production
of heparinase I, II or III, or in the production of a fragment of one of these
proteins. This expression can take place in a continuous manner in the
transformed cells, or in a controlled manner, for example, expression which
follows induction of differentiation of the transformed cells (for example,
by administration of bromodeoxyuracil to neuroblastoma cells or the like).

The expressed protein is isolated and purified in accordance with
conventional conditions, such as extraction, precipitation, chromatography,
electrophoresis, or the like. Detailed procedures for the isolation of the
heparinases are discussed in the examples below.

The invention further provides functional derivatives of the
sequences of heparinase II, heparinase III, and the modified ribosome
binding site. As used herein, the term "functional derivative" is used to
define any DNA sequence which is derived from the original DNA sequence
and which still possesses the biological activities of the native parent
molecule. A functional derivative can be an insertion, a deletion, or a
substitution of one or more bases in the original DNA sequence. The
substitutions can be such that they replace a native amino acid with
another amino acid that does not substantially affect the functioning of the
protein. Those skilled in the art will recognize that likely substitutions
include those that preserve the functioning of the protein, such as a small,
neutrally charged amino acid replacing another small, neutrally charged
amino acid. Those of skill in the art will recognize that functional
derivatives of the heparinases can be prepared by mutagenesis of the DNA
using one of the procedures known in the art, such as site-directed
mutagenesis. In addition, random mutagenesis can be conducted and mutants
retaining function can be obtained through appropriate screening.

The antibodies of the present invention include monoclonal and
polyclonal antibodies, as well as fragments of these antibodies. Fragments of
the antibodies of the present invention include, but are not limited to, the
Fab, the Fab2, and the Fc fragment.




The invention also provides hybridomas which are capable of 
producing the above-described antibodies. A hybridoma is an 
immortalized cell line which is capable of secreting a specific monoclonal 
antibody. 

In general, techniques for preparing polyclonal and monoclonal
antibodies as well as hybridomas capable of producing the desired
antibody are well-known in the art (Campbell, A.M., "Monoclonal Antibody
Technology: Laboratory Techniques in Biochemistry and Molecular
Biology," Elsevier Science Publishers, Amsterdam, The Netherlands (1984);
St. Groth et al., J. Immunol. Methods 35: 1-21 (1980)).

Any mammal which is known to produce antibodies can be
immunized with the heparinase polypeptide. Methods for immunization
are well-known in the art. Such methods include subcutaneous or
intraperitoneal injection of the polypeptide. One skilled in the art will
recognize that the amount of heparinase used for immunization will vary
based on the animal which is immunized, the antigenicity of the peptide
and the site of injection.

The protein which is used as an immunogen may be modified or
administered in an adjuvant in order to increase the protein's antigenicity.
Methods of increasing the antigenicity of a protein are well-known in the
art and include, but are not limited to, coupling the antigen with a
heterologous protein (such as globulin or β-galactosidase) or through the
inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals
are removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma
cells, and allowed to become monoclonal antibody producing hybridoma
cells.

Any one of a number of methods well known in the art can be used
to identify the hybridoma cell which produces an antibody with the
desired characteristics. These include screening the hybridomas with an
ELISA assay, western blot analysis, or radioimmunoassay (Lutz et al., Exp.
Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class
and subclass are determined using procedures known in the art (Campbell,
A.M., Monoclonal Antibody Technology: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984)).

For polyclonal antibodies, antibody-containing antiserum is isolated
from the immunized animal and is screened for the presence of antibodies 
with the desired specificity using one of the above-described procedures. 

The present invention further provides the above-described
antibodies in detectably labelled form. Antibodies can be detectably
labelled through the use of radioisotopes, affinity labels (such as biotin,
avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline
phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, chemiluminescent labels, and the like. Procedures for
accomplishing such labelling are well-known in the art; for example, see
Sternberger, L.A. et al., J. Histochem. Cytochem. 18:315 (1970); Byer, E.A. et
al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129 (1972);
Goding, J.W., J. Immunol. Meth. 13:215 (1976).

The present invention further provides the above-described
antibodies immobilized on a solid support. Examples of such solid supports
include plastics, such as polycarbonate, complex carbohydrates such as
agarose and sepharose, acrylic resins such as polyacrylamide and latex
beads. Techniques for coupling antibodies to such solid supports are well
known in the art (Weir et al., Handbook of Experimental Immunology, 4th
Ed., Blackwell Scientific Publications, Oxford, England (1986)). The
immobilized antibodies of the present invention can be used for
immunoaffinity purification of heparinases.

Having now generally described the invention, the same will be
understood by a series of specific examples, which are not intended to be 
limiting. 

EXAMPLE 1: Purification of Heparinases 

Heparin lyase enzymes were purified from cultures of
Flavobacterium heparinum. F. heparinum was cultured in a 15 L
computer-controlled fermenter using a variation of the defined nutrient
medium described by Galliher et al., Appl. Environ. Microbiol. 41(2):360-365
(1981). Those fermentations designed to produce heparin lyases
incorporated semi-purified heparin (Celsus Laboratories) in the media at a
concentration of 1.0 g/L as the inducer of heparinase synthesis. Cells were
harvested by centrifugation and the desired enzymes released from the
periplasmic space by a variation of the osmotic shock procedure described
by Zimmermann and Cooney, U.S. Patent No. 5,262,325, herein
incorporated by reference.

A semi-purified preparation of the heparinase enzymes was
achieved by a modification of the procedure described by Zimmermann et
al., U.S. Patent No. 5,262,325. Proteins from the crude osmolate were
adsorbed onto cation exchange resin (CBX, J.T. Baker) at a conductivity of
1 - 7 µmho. Unbound proteins from the extract were discarded and the
resin packed into a chromatography column (5.0 cm i.d. x 100 cm). The
bound proteins were eluted at a linear flow rate of 3.75 cm·min⁻¹ with step
gradients of 0.01 M phosphate, 0.01 M phosphate/0.1 M sodium chloride,
0.01 M phosphate/0.25 M sodium chloride and 0.01 M phosphate/1.0 M
sodium chloride, all at pH 7.0 +/- 0.1. Heparinase II elutes in the 0.1 M
NaCl fraction, while heparinases I and III elute in the 0.25 M fraction.

Alternately, the 0.1 M sodium chloride step was eliminated and the
three heparinases co-eluted with 0.25 M sodium chloride. The heparinase
fractions were loaded directly onto a column containing cellufine sulfate
(5.0 cm i.d. x 30 cm, Amicon) and eluted at a linear flow rate of 2.50
cm·min⁻¹ with step gradients of 0.01 M phosphate, 0.01 M phosphate/0.2
M sodium chloride, 0.01 M phosphate/0.4 M sodium chloride and 0.01 M
phosphate/1.0 M sodium chloride, all at pH 7.0 +/- 0.1. Heparinases II and
III elute in the 0.2 M sodium chloride fraction while heparinase I elutes in
the 0.4 M fraction.

The 0.2 M sodium chloride fraction from the cellufine sulfate column
was diluted with 0.01 M sodium phosphate to give a conductance of less
than 5 µmhos. The solution was further purified by loading the material
onto a hydroxylapatite column (2.6 cm i.d. x 20 cm) and eluting the bound
protein at a linear flow rate of 1.0 cm·min⁻¹ with step gradients of 0.01 M
phosphate, 0.01 M phosphate/0.35 M sodium chloride, 0.01 M
phosphate/0.45 M sodium chloride, 0.01 M phosphate/0.65 M sodium
chloride and 0.01 M phosphate/1.0 M sodium chloride, all at pH 7.0 +/- 0.1.
Heparinase II elutes in a single protein peak in the 0.45 M sodium
chloride fraction while heparinase III elutes in a single protein peak in the
0.65 M sodium chloride fraction.

Heparinase I was further purified by loading material from the
cellufine sulfate column, diluted to a conductivity less than 5 µmhos, onto
a hydroxylapatite column (2.6 cm i.d. x 20 cm) and eluting the bound
protein at a linear flow rate of 1.0 cm·min⁻¹ with a linear gradient of
phosphate (0.01 to 0.25 M) and sodium chloride (0.0 to 0.5 M). Heparinase
I elutes in a single protein peak approximately mid-way through the
gradient.
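
The linear flow rates quoted for the three columns can be converted to
volumetric flow rates from the column inner diameters. The following is a
minimal sketch of that arithmetic, assuming an ideal cylindrical column; the
function and dictionary names are illustrative and are not part of the
procedure described here.

```python
import math


def volumetric_flow_ml_per_min(linear_cm_per_min, inner_diameter_cm):
    """Convert a linear flow rate (cm/min) to mL/min for a cylindrical column."""
    cross_section_cm2 = math.pi * (inner_diameter_cm / 2.0) ** 2
    return linear_cm_per_min * cross_section_cm2  # 1 cm^3 equals 1 mL


# (linear flow rate in cm/min, inner diameter in cm), as stated in the text
COLUMNS = {
    "cation exchange (CBX)": (3.75, 5.0),
    "cellufine sulfate": (2.50, 5.0),
    "hydroxylapatite": (1.0, 2.6),
}

if __name__ == "__main__":
    for name, (linear, diameter) in COLUMNS.items():
        flow = volumetric_flow_ml_per_min(linear, diameter)
        print(f"{name}: {flow:.1f} mL/min")
```

Under this assumption the hydroxylapatite column runs at only a few
millilitres per minute, while the two 5.0 cm columns run roughly an order of
magnitude faster.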

The heparinase enzymes obtained by this method were analyzed by 
SDS-PAGE using the technique of Laemmli, Nature 227: 680-685 (1970), 
and the gels quantified by a scanning densitometer (Bio-Rad, Model GS-670).
Heparinases I, II and III displayed molecular weights of 42,500 +/- 2,000,
84,000 +/- 4,200 and 73,000 +/- 3,500 Daltons, respectively. All
proteins displayed purities of greater than 99 %. Purification results for
the heparinase enzymes are shown in Table 1. 

Heparinase activities were determined by the spectrophotometric 
assay described by Yang et al. A modification of this assay incorporating a 
reaction buffer comprised of 0.018 M Tris, 0.044 M sodium chloride and 
1.5 g/L heparan sulfate at pH 7.5 was used to measure heparan sulfate 
degrading activity. 

Recombinant heparinase I forms intracellular inclusion bodies which 
require denaturation and protein refolding to obtain active heparinase. 
Two solvents, urea and guanidine hydrochloride, were examined as 
solubilizing agents. Of these, only guanidine HCl, at 6 M, was able to
solubilize the heparinase I inclusion bodies. However, the highest degree
of purification was obtained by sequentially washing the inclusion bodies
in 3 M urea and 6 M guanidine HCl. The urea wash step served to remove
contaminating E. coli proteins and cell debris prior to solubilization of the
aggregated heparinase I by guanidine HCl.

Recombinant heparinase I was prepared by growing E. coli 
Y1090(pGHepl), a strain harboring a plasmid containing the heparinase I 
gene expressed from tandem tac promoters, in Luria broth with 0.1 M 
IPTG. The cells were concentrated by centrifugation and resuspended in 
1/10th volume buffer containing 0.01 M sodium phosphate and 0.2 M
sodium chloride at pH 7.0. The cells were disrupted by sonication (5
minutes with intermittent 30 second cycles, power setting #3) and the
inclusion bodies concentrated by centrifugation (7,000 x g, 5 minutes). The
pellets were washed two times with cold 3 M urea for 2 hours at pH 7.0
and the insoluble material recovered by centrifugation. Heparinase I was
unfolded in 6 M guanidine HCl containing 50 mM DTT and refolded by
dialysis into 0.1 M ammonium sulfate. Additional contaminating proteins
precipitated in the 0.1 M ammonium sulfate and could be removed by
centrifugation. Heparinase I purified by this method had a specific activity
of 42.21 IU/mg and was 90 % pure by SDS-PAGE/scanning densitometry
analysis. The enzyme can be further purified by cation exchange
chromatography, as described above, yielding a heparinase I preparation
that is more than 99 % pure by SDS-PAGE/scanning densitometry analysis.

EXAMPLE 2: Characterization of Heparinases

The molecular weight and kinetic properties of the three heparinase 
enzymes have been accurately reported by Lohse and Linhardt, J. Biol. 
Chem. 267:24347-24355 (1992). However, an accurate characterization of 
the proteins' post-translational modifications had not been carried out. 

Heparinases I, II and III, purified as described herein, were analyzed for
the presence of carbohydrate moieties. Solutions containing 2 µg of
heparinases I, II and III and recombinant heparinase I were brought to pH
5.7 by adding 0.2 M sodium acetate. These protein samples underwent
carbohydrate biotinylation following protocol 2a, described in the
GlycoTrack kit (Oxford Glycosystems). 30 µl of each biotinylated protein
solution was subjected to SDS-PAGE (10% gel) and transferred by
electroblotting at 170 mA constant current to a nitrocellulose membrane.
Detection of the biotinylated carbohydrate was accomplished by an
alkaline phosphatase-specific color reaction after attachment of a
streptavidin-alkaline phosphatase conjugate to the biotin groups. These
analyses revealed that heparinases I and II are glycosylated and 
heparinase III and recombinant heparinase I are not. 

Polyclonal antibodies generated in rabbits injected with wild type 
heparinase I could be fractionated into two populations as described 
below. It appears that one of these fractions recognizes a post- 
translational moiety common to proteins made in F. heparinum, while the 
other fraction specifically recognizes amino acid sequences contained in 
heparinase I. All heparinase enzymes made in F. heparinum were 
recognized by the "non-specific" antibodies, but heparinase made in E.
coli was not. The most likely candidate for the non-protein antigenic determinant
from heparinase I is the carbohydrate component; thus, the Western blot 
experiment indicates that all lyases made in F. heparinum are glycosylated. 

Purified heparinases II and III were analyzed by the technique of 
Edman to determine the N-terminal amino acid residue of the mature 
protein. However, the Edman chemistry was unable to liberate an amino 
acid, indicating that a post-translational modification had occurred at the 
N-terminal amino acid of both heparinases. One nmol samples of 
heparinases II and III were used for deblocking with pyroglutamate 
aminopeptidase. Control samples were produced by mock deblocking 1 
nmol protein samples without adding pyroglutamate aminopeptidase. All 
samples were placed in 10 mM NH4CO3, pH 7.5, and 10 mM DTT (100 µl
final volume). To non-control samples, 1 mU of pyroglutamate
aminopeptidase was added and all samples were incubated for 8 hr at
37°C. After incubation, an additional 0.5 mU of pyroglutamate
aminopeptidase was added to non-control samples and all samples were
incubated for an additional 16 hr at 37°C.
Deblocking buffers were exchanged for 35% formic acid using a
10,000 Dalton cut-off Centricon unit and the sample was dried under 
vacuum. The samples were subjected to amino acid sequence analysis 
according to the method of Edman. 

The properties of the three heparinase proteins from Flavobacterium
heparinum are listed in Table 2.

Heparinases II and III were digested with cyanogen bromide in 
order to produce peptide fragments for isolation. The protein solutions (1- 
10 mg/ml protein concentration) were brought to a DTT concentration of 
0.1 M, and incubated at 40°C for 2 hr. The samples were frozen and
lyophilized under vacuum. The pellet was resuspended in 70% formic acid, 
and nitrogen gas was bubbled through the solution to exclude oxygen. A 
stock solution of CNBr was made in 70% formic acid and the stock solution 
was bubbled with nitrogen gas and stored in the dark for short time 
periods. For addition of CNBr, a 500 to 1000 times molar excess of CNBr to 
methionine residues in the protein was used. The CNBr stock was added to 
the protein solutions, bubbled with nitrogen gas and the tube was sealed. 
The reaction tube was incubated at 24°C for 20 hr, in the dark. 
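
Cyanogen bromide cleaves a polypeptide on the C-terminal side of
methionine residues, and the reagent quantity follows from the stated molar
excess over methionine. The following is a minimal sketch of both points. The
example sequence, the methionine count and the protein amount are hypothetical
values chosen only for illustration, and the CNBr molar mass (105.92 g/mol) is
a standard value rather than a figure from this text.

```python
from typing import List

CNBR_MOLAR_MASS = 105.92  # g/mol, standard value for cyanogen bromide


def cnbr_fragments(protein: str) -> List[str]:
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments = []
    current = ""
    for residue in protein.upper():
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def cnbr_mass_mg(protein_mg: float, protein_mw: float,
                 met_count: int, molar_excess: float) -> float:
    """Mass of CNBr (mg) giving the requested molar excess over methionine."""
    mol_protein = (protein_mg / 1000.0) / protein_mw
    mol_cnbr = mol_protein * met_count * molar_excess
    return mol_cnbr * CNBR_MOLAR_MASS * 1000.0


if __name__ == "__main__":
    # Hypothetical sequence, not taken from heparinase II or III.
    print(cnbr_fragments("AKMGGSMLV"))
    # Hypothetical case: 1 mg of an 84,000 Da protein with 20 methionines.
    for excess in (500, 1000):
        print(excess, round(cnbr_mass_mg(1.0, 84000.0, 20, excess), 2), "mg")
```

The sketch ignores incomplete cleavage, which occurs in practice and can
leave methionine inside a recovered peptide.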

The samples were dried down partially under vacuum, water was 
added to the sample, and partial lyophilization was repeated. This washing 
procedure was repeated until the sample pellets were white. The peptide 
mixtures were solubilized in formic acid and applied to a Vydac C18
reverse phase HPLC column (4.6 mm i.d. x 30 cm) and individual peptide
fragments eluted at a linear flow rate of 6.0 cm·min⁻¹ with a linear
gradient of 10 to 90 % acetonitrile in 1 % trifluoroacetic acid. Fragments
recovered from these reactions were subjected to amino acid sequence
determination using an Applied Biosystems 745A Protein Sequencer. Three
peptides isolated from heparinase II gave sequences: EFPEMYNLAAGR
(SEQU ID NO:5), KPADIPEVKDGR (SEQU ID NO:6), and
LAGDFVTGKILAQGFGPDNQTPDYTYL (SEQU ID NO:7) and were named peptides
2A, 2B and 2C, respectively. Three peptides from heparinase III gave
sequences: LIKNEVRWQLHRVK (SEQU ID NO:8), VLKASPPGEFHAQPDNGTFELFI
(SEQU ID NO:9) and KALVHWFWPHKGYGYFDYGKDIN (SEQU ID NO:10) and were
named peptides 3A, 3B and 3C, respectively.

EXAMPLE 3: Antibodies to the Heparinase Proteins 

Heparinases I, II and III and recombinant heparinase I, purified as 
described herein, were used to generate polyclonal antibodies in rabbits. 
Each of heparinase I, II and III was carried through the following standard 
immunization procedure: The primary injection consisted of 0.5 - 1.0 mg of
purified protein dissolved in 1 ml of sterile phosphate buffered saline,
which was homogenized with 1 ml of Freund's adjuvant (Cedarlane 
Laboratories Ltd.). This protein-adjuvant emulsion was used to inject New 
Zealand White female rabbits; 1 ml per rabbit, 0.5 ml per rear leg, i.m., in 
the thigh muscle near the hip. After 2 to 3 weeks, the rabbits were given 
an injection boost consisting of 0.5 - 1.0 mg of purified protein dissolved in 
sterile phosphate buffered Saline homogenized with 1 ml of incomplete 
Freund's adjuvant (Cedarlane Laboratories, Ltd.). Again after 2 to 3 weeks, 
the rabbits were given a third identical injection boost. 

A blood sample was collected from each animal from the central artery 
of the ear approximately 10 days following the final injection boost. 
Serum was prepared by allowing the sample to clot for 2 hours at 22°C 
followed by overnight incubation at 4°C, and clearing by centrifugation at 
5,000 rpm for 10 min. The antisera were diluted 1:100,000 in Tris- 
buffered Saline (pH 7.5) and carried through Western blot analysis to 
identify those sera containing anti-heparinase I, II or III antibodies.

Antibodies generated against wild type heparinase I, but not
recombinant heparinase I, displayed a high degree of cross reactivity
against other F. heparinum proteins. This was likely due to the presence
of an antigenic post-translational modification common to F. heparinum
proteins but not found on proteins synthesized in E. coli. To explore this
further, recombinant heparinase I was immobilized onto Sepharose beads
and packed into a chromatography column. Purified anti-heparinase I
(wild type) antibodies were loaded onto the column and the unbound
fraction collected. Bound antibodies were eluted in 0.1 M glycine, pH 2.0.
IgG was found in both the unbound and bound fractions and subsequently
used in Western blot experiments. Antibody isolated from the unbound
fraction non-specifically recognized F. heparinum proteins but no longer
detected recombinant heparinase I (E. coli), while the antibody isolated
from the bound fraction only recognized heparinase I, whether synthesized
in F. heparinum or E. coli. This result indicated that, as hypothesized, two
populations of antibodies are formed by exposure to the wild-type
heparinase I antigen: one specific for the protein backbone and the other
recognizing a post-translationally modified moiety common to F.
heparinum proteins.

This finding provides both a means to purify specific anti-heparinase
antibodies and a tool for characterizing the wild-type heparinase I protein.

EXAMPLE 4: Construction of a F. heparinum Gene Library 

A Flavobacterium heparinum chromosomal DNA library was
constructed in lambda phage DASHII. 0.4 µg of F. heparinum chromosomal
DNA was partially digested with restriction enzyme Sau3A to produce a
majority of fragments around 20 kb in size, as described in Maniatis, et al.,
Molecular Cloning Manual, Cold Spring Harbor (1982). This DNA was
phenol/chloroform extracted, ethanol precipitated, ligated with λDASHII
arms and packaged with packaging extracts from a λDASHII/BamHI
Cloning Kit (Stratagene, La Jolla, CA). The library was titered at
approximately 10^5 pfu/ml after packaging, amplified to 10^8 pfu/ml by
the plate lysis method, and stored at -70°C as described by Silhavy, T.J., et
al. in Experiments in Molecular Genetics, Cold Spring Harbor Laboratory,
1992.

The F. heparinum chromosomal library was titered to about 300 
pfu/plate, overlaid on a lawn of E. coli, and allowed to transfect the cells
overnight at 37°C, forming plaques. The phage plaques were transferred 
to nitrocellulose paper, and the phage DNA bound to the filters, as 
described in Maniatis, et al., ibid. 

EXAMPLE 5: A Modified Ribosome Binding Region for 
the Expression of Flavobacterium heparinum 
Glycosaminoglycan Lyases 

The gene for the mature heparinase I protein was cloned into the
EcoRI site of the vector, pB9, where its expression was driven by two
repeats of the tac promoter (from expression vector, pKK223-3, Brosius
and Holy, Proc. Natl. Acad. Sci. USA 81: 6929-6933 (1984)). In this vector,
pBhep, the first codon, ATG, for heparinase I is separated by 10
nucleotides from a minimal Shine-Dalgarno sequence AGGA (Shine and
Dalgarno, Proc. Natl. Acad. Sci. USA 71:1342-1346 (1974)), Figure 1. This
construct was transformed into the E. coli strain, JM109, grown at 37°C
and induced with 1 mM IPTG, 2 hours before harvesting. Cells were lysed
by sonication, the cell membrane fraction was pelleted and the
supernatant was saved. The membrane fraction was resuspended in 6 M
guanidine-HCl in order to solubilize inclusion bodies containing the
recombinant heparinase I enzyme. The soluble heparinase I was refolded
by diluting in 20 mM phosphate buffer. The enzyme activity was
determined in the refolded pellet fraction, and in the supernatant fraction.
Low levels of activity were detected in the supernatant and the pellet
fractions. Analysis of the fractions by SDS-PAGE indicated that both
fractions may contain minor bands corresponding to the recombinant
heparinase I.

In an attempt to increase expression levels from pBhep, two 
mutations were introduced as indicated in Figure 1. The mutations were 
produced to improve the level of translation of the heparinase I mRNA by 
increasing the length of the Shine-Dalgarno sequence and by decreasing 
the distance between the Shine-Dalgarno sequence and the ATG-start site. 
Using PCR, a single base mutation converting an A to a G improved the
Shine-Dalgarno sequence from a minimal AGGA sequence to AGGAG while
decreasing the distance between the Shine-Dalgarno sequence and the
translation start site from 10 to 9 base pairs. This construct was named
pGhep. In the second construct, pΔ4hep, 4 nucleotides (AACA) were
deleted using PCR, in order to lengthen the Shine-Dalgarno sequence to
AGGAG as well as moving it to within 5 base pairs of the ATG-start site.
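
The effect of the two mutations on the Shine-Dalgarno motif and its spacing
from the start codon can be checked with a short script. The following is a
sketch only: Figure 1 is not reproduced in this section, so the parental
sequence below is hypothetical. Its first nine nucleotides (AGGA followed by
AACAG) are inferred from the description of the two mutations, and the
five-nucleotide filler CTTAA is an arbitrary placeholder chosen so that the
stated spacings of 10, 9 and 5 nucleotides are reproduced.

```python
SD_MOTIFS = ("AGGAG", "AGGA")  # search for the longer motif first


def sd_spacing(seq):
    """Return (motif, spacer length) for the first SD motif and the next ATG."""
    seq = seq.upper()
    for motif in SD_MOTIFS:
        start = seq.find(motif)
        if start == -1:
            continue
        end = start + len(motif)
        atg = seq.find("ATG", end)
        if atg == -1:
            return motif, None
        return motif, atg - end
    return None, None


# Hypothetical parental region: minimal SD, 10-nucleotide spacer, start codon.
PARENT = "AGGA" + "AACAG" + "CTTAA" + "ATG"
# Single A-to-G change immediately after AGGA.
POINT_MUTANT = PARENT[:4] + "G" + PARENT[5:]
# Deletion of the four nucleotides AACA.
DELETION = PARENT.replace("AACA", "", 1)

if __name__ == "__main__":
    for name, seq in (("parent", PARENT),
                      ("A-to-G mutant", POINT_MUTANT),
                      ("4-nucleotide deletion", DELETION)):
        print(name, seq, sd_spacing(seq))
```

Both mutants carry the same extended AGGAG motif in this sketch, so the
difference in activity reported for them would have to come from the spacing
rather than from the motif itself.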

The different constructs were analyzed as described above. Refolded 
pellets from E. coli transformed with pGhep displayed approximately a 7X 
increase in heparinase I activity, as compared to refolded pellets from E. 
coli containing pBhep. On the other hand, E. coli containing pΔ4hep
displayed 2-3 times less activity than the pBhep containing E. coli. The
levels of heparinase I activity in the supernatants were similar.

Plasmid, pBhep, was digested with EcoRI and treated with S1
nuclease to form blunt-ended DNA. The plasmid DNA was then digested
with BamHI and the single-stranded ends were made double-stranded by
filling-in with Klenow fragment. The blunt-end DNA was ligated and
transformed into E. coli strain FTB1. A plasmid which contained a unique
BamHI site and no heparinase I gene DNA was purified from a kanamycin
resistant colony and was designated plasmid, pGB. DNA sequence analysis
revealed that plasmid pGB contained the modified ribosome binding site,
shown in Figure 1.

EXAMPLE 6: Nucleic Acid Encoding Heparinase II 

Four "guessmer" oligonucleotides were designed using information
from two peptide sequences, 2A and 2B, and the consensus codons
for Flavobacterium, shown in Table 3. These were:
5'-GAATTCCCTGAGATGTACAATCTGGCCGC-3' (SEQU ID NO:11),
5'-CCGGCAGCCAGATTGTACATTTCAGG-3' (SEQU ID NO:12),
5'-AAACCCGCCGACATTCCCGAAGTAAAAGA-3' (SEQU ID NO:13), and
5'-CGAAAGTCTTTTACTTCGGGAATGTCGGC-3' (SEQU ID NO:14),
named 2-1, 2-2, 2-3 and 2-4, respectively. The oligonucleotides were
synthesized with a Bio/CAN (Mississauga, Ontario) peptide synthesizer.
Pairs of these oligonucleotides were used as primers in PCR reactions. F.
heparinum chromosomal DNA was digested with restriction endonucleases
SalI, XbaI or NotI, and the fragmented DNA combined for use as the
template DNA. Polymerase chain reaction mixtures were produced using
the DNA Amplification Reagent Kit (Perkin Elmer Cetus, Norwalk, CT). The
PCR amplifications were carried out in 100 µl reaction volume containing
50 mM KCl, 10 mM Tris HCl, pH 9, 0.1% Triton X-100, 1.5 mM MgCl2, 0.2
mM of each of the four deoxyribose nucleotide triphosphates (dNTPs), 100
pmol of each primer, 10 ng of fragmented F. heparinum genomic DNA and
2.5 units of Taq polymerase (Bio/CAN Scientific Inc., Mississauga, Ontario).
The samples were placed on an automated heating block (DNA thermal
cycler, Barnstead/Thermolyne Corporation, Dubuque, IA) programmed for
step cycles of: denaturation temperature 92°C (1 minute), annealing
temperatures of 37°C, 42°C or 45°C (1 minute) and extension temperature
72°C (2 minutes). These cycles were repeated 35 times. The resulting PCR
products were analyzed on a 1.0% agarose gel containing 0.6 µg/ml
ethidium bromide, as described by Maniatis, et al., ibid. DNA fragments
were produced by oligonucleotides 2-2 and 2-3. The fragments, 250 bp
and 350 bp in size, were first separated by 1% agarose gel electrophoresis,
and the DNA extracted using the GENECLEAN I kit (Bio/CAN Scientific,
Mississauga, Ontario). Purified fragments were ligated into pTZ/PC (Tessier
and Thomas, unpublished) previously digested with NotI, Figure 2, and the
ligation mixture used to transform E. coli FTB1, as described in Maniatis et
al., ibid. All restriction enzymes and T4 DNA ligase were purchased from
New England Biolabs (Mississauga, Ontario).
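
The relationship between the four heparinase II oligonucleotides and peptides
2A and 2B can be checked by translating each oligonucleotide, or its reverse
complement, with the standard genetic code. The following is a minimal sketch.
The sequences are transcribed from the text above, while the helper names and
the compact codon-table construction are illustrative choices.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order at each position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate from the first base; trailing partial codons are ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reverse_complement(dna):
    return dna.upper().translate(COMPLEMENT)[::-1]


PEPTIDE_2A = "EFPEMYNLAAGR"
PEPTIDE_2B = "KPADIPEVKDGR"
OLIGO_2_1 = "GAATTCCCTGAGATGTACAATCTGGCCGC"
OLIGO_2_2 = "CCGGCAGCCAGATTGTACATTTCAGG"
OLIGO_2_3 = "AAACCCGCCGACATTCCCGAAGTAAAAGA"
OLIGO_2_4 = "CGAAAGTCTTTTACTTCGGGAATGTCGGC"

if __name__ == "__main__":
    print("2-1 sense:", translate(OLIGO_2_1))
    print("2-2 antisense:", translate(reverse_complement(OLIGO_2_2)))
    print("2-3 sense:", translate(OLIGO_2_3))
    # Compare 2-4, minus its first six bases, with the reverse complement of 2-3.
    print("2-4 overlaps 2-3:",
          reverse_complement(OLIGO_2_3).startswith(OLIGO_2_4[6:]))
```

Read this way, 2-1 and 2-3 are sense-strand primers for peptides 2A and 2B,
and 2-2 and 2-4 are the corresponding antisense primers, which is consistent
with the productive pair being 2-2 and 2-3.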

Strain FTB1 was constructed in our laboratory. The F episome from
the XL-1 Blue E. coli strain (Stratagene, La Jolla, CA), which carries the lac
Iq repressor gene and produces 10 times more lac repressor than wild
type E. coli, was moved, as described by J. Miller, Experiments in Molecular
Genetics, Cold Spring Harbor Laboratory (1972), into the TB1 E. coli strain,
described by Baker, T.A., et al., Proc. Natl. Acad. Sci. 81:6779-6783 (1984).
The FTB1 background permits a more stringent repression of transcription
from plasmids carrying promoters with a lac operator (i.e. lac and tac
promoters). Colonies resulting from the transformation of FTB1 were
selected on LB agar containing ampicillin and screened using the
blue/white screen provided by X-gal and IPTG included in the agar
medium, as described by Maniatis, et al., ibid. Transformants were
analyzed by colony cracking and mini-preparations of DNA were made for
enzyme restriction analysis using the RPM kit (Bio/CAN Scientific Inc.,
Mississauga, Ontario). Ten plasmids contained inserts of the correct size,
which were released upon digestion with EcoRI and HindIII.

DNA sequencing revealed that one of the plasmids, pCE14, contained
a 350 bp PCR fragment that had the expected DNA sequence as derived from
peptide 2C. DNA sequences were determined by the dideoxy-chain
termination method of Sanger et al., Proc. Natl. Acad. Sci. 74:5463-5467
(1978). Sequencing reactions were carried out with the Sequenase Kit (U.S.
Biochemical Corp., Cleveland, Ohio) and 35S-dATP (Amersham Canada Ltd.,
Oakville, Ontario, Canada), as specified by the supplier.

The heparinase II gene was cloned from a F. heparinum chromosomal
DNA library, Figure 2, constructed as described above. Ten plaque-containing
filters were hybridized with the DNA probe, produced from the
gel purified insert of pCE14, which was labeled using a Random Labeling
Kit (Boehringer Mannheim Canada, Laval, Quebec). Plaque hybridization
was carried out, as described in Maniatis et al., ibid., at 65°C for 16 hours
in a Tek Star hybridization oven (Bio/CAN Scientific, Mississauga, Ontario).
Subsequent washes were performed at 65°C: twice for 15 min. in 2X SSC,
once in 2X SSC/0.1% SDS for 30 min. and once in 0.5X SSC/0.1% SDS for 15
min. Positive plaques were harvested using plastic micropipette tips and
confirmed by dot blot analysis, as described by Maniatis et al., ibid. Six of
the phages, which gave strong hybridization signals, were used for
Southern hybridization analysis, as described by Southern, E.M., J. Mol. Biol.
98:503-517 (1975). This analysis showed that one phage, HIIS, contained
a 5.5 kb XbaI DNA fragment which hybridized with the probe. Cloning the
5.5 kb XbaI fragment into the XbaI site of any of the following vectors: pTZ/PC,
pBluescript (Stratagene, La Jolla CA), pUC18 (described in Yanisch-Perron
et al., Gene 33:103-119 (1985)), and pOK12 (described in Vieira and
Messing, Gene 100:189-194 (1991)), was unsuccessful, even though the
FTB1 background was used to repress plasmid promoter-derived
transcription. Vector, pOK12, a low copy number plasmid derived from
pACYC184 (approximately 10 copies/cell, Chang, A.C.Y. and Cohen, S.N., J.
Bact. 134:1141-1156 (1978)) was used in an attempt to circumvent the
toxic effects of a foreign DNA fragment in E. coli by minimizing the number
of copies of the toxic foreign fragment. In addition, insertion of the entire
NotI chromosomal DNA insert of the HIIS phage into plasmid pOK12
was unsuccessful. It was concluded that this region of F.
heparinum chromosome imparts a negative-selective effect on any E. coli
cells that harbor it. This toxic effect had not been observed previously
with other F. heparinum chromosomal DNA fragments.

A second strategy employed to circumvent the unexpected problem
of F. heparinum DNA toxicity in E. coli was to digest the chromosomal DNA
fragment with a restriction endonuclease which would divide the
fragment, and if possible the heparinase II gene, into two pieces, Figure 2.
These fragments could be cloned individually. DNA sequence analysis of
the PCR insert in plasmid, pCE14, demonstrated that BamHI and EcoRI sites
were present in the insert. Hybridization experiments also demonstrated
that the BamHI digested F. heparinum DNA in phage HIIS produced two
bands 1.8 and 5.5 kb in size. Analysis of hybridization data indicated that
the 1.8 kb band contains the 5' end and the 5.5 kb band contains the 3'
end of the gene. Furthermore, a 5 kb EcoRI F. heparinum chromosomal
DNA fragment hybridized with the PCR probe. The 1.8, 5, and 5.5 kb
fragments containing heparinase II gene sequences were inserted into
pBluescript, as described above. Two clones, pBSIB6-7 and pBSIB6-21,
containing the 5.5 kb BamHI insert in different orientations were isolated
and one plasmid, pBSIB2-13, was isolated which contained the 1.8 kb
BamHI fragment. No clones containing the 5 kb EcoRI fragment were
isolated, even though extensive screening of possible clones was done.

The molecular weight of heparinase II protein is approximately 84
kD, so the size of the corresponding gene would be approximately 2.4 kb.
The 1.8 and 5.5 kb BamHI chromosomal DNA fragments could include the
entire heparinase II gene. The plasmids pBSIB6-7, pBSIB6-21 and
pBSIB2-13, Figure 2, were used to produce nested deletions with the
Erase-a-Base system (Promega Biotec, Madison, Wis.). These plasmids were
used as templates for DNA sequence analysis using universal and reverse
primers and oligonucleotide primers derived from known heparinase II
sequence. Because parts of the gene were relatively G-C rich and
contained numerous strong secondary structures, the sequence analysis
was, at times, performed using reactions in which the dGTP was replaced
by dITP. Analysis of the DNA sequence, Figure 4, indicated that there was
a single, continuous open reading frame containing codons for 772 amino
acid residues, Figure 5. Searching for a possible signal peptide sequence
using Geneworks (Intelligenetics, Mountain View, CA) suggested that there
are two possible sites for processing of the protein into a mature form:
Q-26 (glutamine) and D-30 (aspartate). N-terminal amino acid sequencing of
deblocked, processed heparinase II indicated that the mature protein
begins with Q-26, and contains 747 amino acids with a calculated
molecular weight of 84,545 Daltons, Figure 5.
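
The size estimates in this paragraph can be reproduced with simple
arithmetic. The following is a sketch under one stated assumption: an average
residue mass of about 110 Da, a common rule of thumb that does not come from
this text. The function names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def estimated_gene_size_bp(protein_mass_da, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate coding length (bp) for a protein of the given mass."""
    residues = protein_mass_da / avg_residue_mass
    return 3.0 * residues


def mature_length(precursor_length, first_mature_residue):
    """Residues remaining after removing everything before the given position."""
    return precursor_length - (first_mature_residue - 1)


def within_tolerance(value, center, tolerance):
    return abs(value - center) <= tolerance


if __name__ == "__main__":
    print("estimated gene size (bp):", round(estimated_gene_size_bp(84000)))
    print("open reading frame (bp, without stop codon):", 772 * 3)
    print("mature length from Q-26:", mature_length(772, 26))
    print("mature length from D-30:", mature_length(772, 30))
    print("calculated mass within SDS-PAGE range:",
          within_tolerance(84545, 84000, 4200))
```

With this assumption the 84 kD protein corresponds to a coding region of
roughly 2.3 kb, close to the 2.4 kb estimate given above, and processing at
Q-26 leaves the 747 residues stated for the mature protein.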



EXAMPLE 7: Expression of Heparinase II in E. coli

The vector, pGB, was used for heparinase II expression in E. coli,
Figure 3. pGB contains the modified ribosome binding region from pGhep,
Figure 1, and a unique BamHI site, whereby expression of a DNA fragment
inserted into this site is driven by a double tac promoter. The vector also
includes a kanamycin resistance gene, and the lac Iq gene to allow
induction of transcription with IPTG. Initially, a gel purified 5.5 kb BamHI
fragment from pBSIB6-21 was ligated with BamHI digested pGB and
transformed into FTB1, which was selected on LB agar with kanamycin.
Six of the resulting colonies contained plasmids with inserts in the correct
orientation for expression of the open reading frame. PstI digestion and
religation of one of the plasmids, forming pGBIID, deleted 3.5 kb of the 5.5
kb BamHI fragment and removed a BamHI site leaving only one BamHI
site directly after the Shine-Dalgarno sequence. Finally, two synthetic
oligonucleotides were designed:
5'-TGAGGATTCATGCAAACCAAGGCCGATGTGGTTTGGAA-3' (SEQU ID NO:15), and
5'-GGAGGATAACCACATTCGAGCATT-3' (SEQU ID NO:16)
for use in a PCR to produce a fragment containing a BamHI
site and an ATG start codon upstream of the mature protein encoding
sequence and a downstream BamHI site, Figure 3. Lambda clone HII-I,
isolated at the same time as lambda clone HIIS, was used as template DNA.

Cloning the blunt-end PCR product into pTZ/PC was unsuccessful,
using FTB1 as the host. Cloning the BamHI digested PCR product into the
BamHI site of pBluescript, again using FTB1 as the host, resulted in the
isolation of 2 plasmids containing the PCR fragment, after screening of 150
possible clones. One of these, pBSQTK-9, which was sequenced with
reverse and universal primers, contained an accurate reproduction of the
DNA sequence from the heparinase II gene. The BamHI digested PCR
fragment from pBSQTK-9 was inserted into the BamHI site of pGBIID in
such orientation that the ATG site was downstream of the Shine-Dalgarno
sequence. This construct, pGBH2, placed the mature heparinase II gene
under control of the tac promoters in pGB, Figure 3. Strain E. coli
FTB1(pGBH2) was grown in LB medium containing 50 µg/ml kanamycin at
37°C for 3 h. Induction of the tac promoter was achieved by adding 1
mmol IPTG and the culture placed at either room temperature or 30°C.
Heparin and heparan sulfate degrading activity was measured in the
cultures after growth for 4 hours using the method described by Yang et
al., ibid. Heparin degrading activities of 0.36 and 0.24 IU/mg protein and
heparan sulfate degrading activities of 0.49 and 0.44 IU/mg protein were
observed at room temperature and 30°C, respectively.



EXAMPLE 8: Nucleic Acid Encoding Heparinase III 

The amino acid sequence information obtained from peptides
derived from heparinase III, Figure 9, purified as described herein,
reverse translated into highly degenerate oligonucleotides. Therefore, a
cloning strategy relying on the polymerase chain reaction amplification of
a section of the heparinase III gene, using oligonucleotides synthesized on
the basis of amino acid sequence information, required eliminating some of
the DNA sequence possibilities. An assumed codon usage was calculated
based on known DNA sequences for genes from other Flavobacterium
species. Sequences for 17 genes were analyzed and a codon usage table
was compiled, Table 3.
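
The degree of degeneracy referred to here can be quantified by multiplying
the number of codons available for each residue of a peptide. The following is
a minimal sketch using the standard genetic code. The seven-residue stretch
PPGEFHA is taken from peptide 3B, and the function name is illustrative.

```python
from collections import Counter

# Standard genetic code as a 64-letter string (codons in TCAG order);
# counting letters gives the number of codons per amino acid.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_RESIDUE = Counter(AMINO_ACIDS)


def reverse_translation_degeneracy(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODONS_PER_RESIDUE or residue == "*":
            raise ValueError(f"not a standard amino acid: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total


if __name__ == "__main__":
    # Stretch of peptide 3B (VLKASPPGEFHAQPDNGTFELFI).
    print("PPGEFHA:", reverse_translation_degeneracy("PPGEFHA"))
```

Even this short stretch corresponds to a few thousand possible coding
sequences, which illustrates why a codon usage table was used to narrow the
choices.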

Four oligonucleotides were designed by choosing each codon
according to the codon usage table. These were:
5'-GAATTCCATCAGTTTCAGCCGCATAAA-3' (SEQU ID NO:17),
5'-GAATTCTTTATGCGGCTGAAACTGATG-3' (SEQU ID NO:18),
5'-GAATTCCCGCCGGGCGAATTTCATGC-3' (SEQU ID NO:19) and
5'-GAATTCGCATGAAATTCGCCCGGCGG-3' (SEQU ID NO:20), and were
named oligonucleotides 3-1, 3-2, 3-3 and 3-4, respectively. These
oligonucleotides were used in all possible combinations, in an attempt to
amplify a portion of the heparinase III gene using the polymerase chain
reaction. The PCR amplifications were carried out as described above.
Cycles of: denaturation temperature 92°C (1 minute), annealing
temperatures ranging from 37° to 55°C (1 minute) and extension
temperature 72°C (2 minutes) were repeated 35 times. Analysis of the
PCR reactions as described above demonstrated that no DNA fragments
were produced by these experiments.
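
The relationship among oligonucleotides 3-1 to 3-4 can be examined by
comparing reverse complements. The following is a minimal sketch. The
sequences are transcribed from the text above; the helper names are
illustrative, and treating the leading GAATTC as an added EcoRI tail is an
interpretation rather than a statement from the text.

```python
ECORI_SITE = "GAATTC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

OLIGOS = {
    "3-1": "GAATTCCATCAGTTTCAGCCGCATAAA",
    "3-2": "GAATTCTTTATGCGGCTGAAACTGATG",
    "3-3": "GAATTCCCGCCGGGCGAATTTCATGC",
    "3-4": "GAATTCGCATGAAATTCGCCCGGCGG",
}


def reverse_complement(dna):
    return dna.upper().translate(COMPLEMENT)[::-1]


def core(oligo):
    """Return the oligonucleotide without a leading EcoRI site, if present."""
    return oligo[len(ECORI_SITE):] if oligo.startswith(ECORI_SITE) else oligo


def same_site_opposite_strands(a, b):
    """True if the two cores are exact reverse complements of each other."""
    return reverse_complement(core(a)) == core(b)


if __name__ == "__main__":
    names = sorted(OLIGOS)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if same_site_opposite_strands(OLIGOS[first], OLIGOS[second]):
                print(first, "and", second, "target the same site")
```

Under this reading, 3-1/3-2 and 3-3/3-4 are each a sense and antisense
version of one site, so only combinations that pair one site with the other
could yield a product spanning the region between them.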
A second set of oligonucleotides was synthesized and was comprised
of 32 base sequences, in which the codon usage table was used to guess the
third position of only half of the codons. The nucleotides within the
parentheses indicate degeneracies of two or four bases at a single site.
These were:

5'-GG(ACGT)GAATTTCCATG... (SEQU ID NO:21),

5'-GT(ACGT)CCATT(AG)TC(ACGT)G... (SEQU ID NO:22),

5'-GT(ACGT)CATCAGTT... (SEQU ID NO:23), and

5'-CCCATA(ACGT)CCTTTATG... (SEQU ID NO:24),

and were named oligonucleotides 3-5, 3-6, 3-7 and 3-8,
respectively. These oligonucleotides were used in an attempt to amplify a
portion of the heparinase III gene using the polymerase chain reaction,
and the combination of 3-6 and 3-7 gave rise to a specific 983 bp PCR
product. An attempt was made to clone this fragment by blunt end
ligation into the E. coli vector pBluescript, as well as two specifically designed
vectors for the cloning of PCR products, pTZ/PC and pCRII from the TA
Cloning™ kit (Invitrogen Corporation, San Diego, CA). All of these
constructs were transformed into the FTB1 E. coli strain. Transformants
were first analyzed by colony cracking, and subsequently
mini-preparations of DNA were made for restriction enzyme analysis. No clones
containing this PCR fragment were isolated.
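
The parenthesis notation for degenerate sites can be expanded mechanically. The following is a minimal sketch, assuming the notation described above (a parenthesized group lists the bases allowed at one site); the function names are illustrative only, and the example string is just the legible 5' portion of oligonucleotide 3-6 as printed, not the complete primer.

```python
import re
from itertools import product

# One token is either a single base or a parenthesized group of alternatives.
_TOKEN = re.compile(r"\(([ACGT]+)\)|([ACGT])")
_WHOLE = re.compile(r"(?:\([ACGT]+\)|[ACGT])+")


def parse_degenerate(oligo):
    """Return a list with the allowed bases at each position of the oligo."""
    if _WHOLE.fullmatch(oligo) is None:
        raise ValueError(f"unexpected characters in oligo: {oligo!r}")
    return [group or single for group, single in _TOKEN.findall(oligo)]


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for choices in parse_degenerate(oligo):
        count *= len(choices)
    return count


def expand(oligo):
    """List every concrete sequence represented by a degenerate oligo."""
    return ["".join(bases) for bases in product(*parse_degenerate(oligo))]


if __name__ == "__main__":
    partial_3_6 = "GT(ACGT)CCATT(AG)TC(ACGT)G"  # legible 5' portion only
    print(len(parse_degenerate(partial_3_6)), "positions")
    print(degeneracy(partial_3_6), "sequences in the pool")
```

Because the degeneracy is the product of the group sizes, each additional four-fold site multiplies the size of the primer pool by four, which is why the codon usage table was used to fix half of the third positions.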
A third set of oligonucleotides was synthesized incorporating BamHI
endonuclease sequences on the ends of the 3-6 and 3-7 oligonucleotide
sequences. A 999 base pair DNA sequence was obtained using the
polymerase chain reaction with F. heparinum chromosomal DNA as the
target. Attempts were made to clone the amplified DNA into the BamHI
site of the high copy number plasmid pBluescript and the low copy
number plasmids pBR322 and pACYC184. All of these constructs were
again transformed into the FTB1 E. coli strain. More than 500 candidates
were screened, yet no transformants containing a plasmid harboring the
F. heparinum DNA were obtained. Once again, it was concluded that this
region of the F. heparinum chromosome imparts a negative-selective effect on
E. coli cells that harbor it.

As in the case for isolation of the heparinase II gene, the PCR
fragment was split in order to avoid the problem of foreign DNA toxicity.
Digestion of the 981 bp BamHI-digested heparinase III PCR fragment with
restriction endonuclease ClaI produced two fragments of 394 and 587 bp.
The amplified F. heparinum region was treated with ClaI and the two
fragments separated by agarose gel electrophoresis. The 587 and 394 base
pair fragments were ligated separately into plasmid pBluescript that had
been treated with restriction endonucleases BamHI and ClaI. In addition,
the entire 981 bp PCR fragment was purified and ligated into BamHI cut
pBluescript. The ligated plasmids were inserted into E. coli strain XL-1 Blue.
Transformants containing plasmids with inserts were selected on the basis
of their ability to form white colonies on LB-agar plates containing X-gal,
IPTG and 50 µg/ml ampicillin, as described by Maniatis. Plasmid pFB1
containing the 587 bp F. heparinum DNA fragment and plasmid pFB2
containing the entire 981 base pair fragment were isolated by this method.
The XL-1 Blue strain, which, like strain FTB1, contains the lacIq repressor
gene on an F' episome, allowed for stable maintenance of the complete
BamHI PCR fragment, unlike FTB1. The reason for this discrepancy is not
apparent from the genotypes of the two strains (i.e., both are recA, etc.).
DNA sequence analysis of the F. heparinum DNA in plasmid pFB1
showed that it contained a sequence encoding peptide Hep3-B while the
F. heparinum insert in plasmid pFB2 contained a DNA sequence encoding
peptides Hep3-D and Hep3-B, Figure 9. This analysis confirmed that these
inserts were part of the gene encoding heparinase III.

The PCR fragment insert in plasmid pFB1 was labeled with 32P-ATP
using a Random Primed DNA Labeling kit (Boehringer Mannheim, Laval,
Quebec), and was used to screen the F. heparinum λDASHII library, Figure
6, constructed as described herein. The lambda library was plated out to
obtain approximately 1500 plaques, which were transferred to
nitrocellulose filters (Schleicher & Schuell, Keene, NH). The PCR probe was
purified by ethanol precipitation. Plaque hybridization was carried out
using the conditions described above. Eight positive lambda plaques were
identified. Lambda DNA was isolated from lysed bacterial cultures as
described in Maniatis and further analyzed by restriction analysis and by
Southern blotting using a Hybond-N nylon membrane (Amersham
Corporation, Arlington Heights, IL) following the protocol described in
Maniatis. A 2.7 kilobase HindIII fragment from lambda plaque #3, which
strongly hybridized to the PCR probe, was isolated and cloned in
pBluescript, in the XL-1 Blue E. coli background, to yield plasmid
pHindIIIBD, Figure 6. This clone was further analyzed by DNA sequencing.
The sequence data was obtained using successive nested deletions of
pHindIIIBD generated with the Erase-a-Base System (Promega Corporation,
Madison, WI) or sequenced using synthetic oligonucleotide primers.
Sequence analysis revealed a single continuous open reading frame,
without a translational termination codon, of 1929 base pairs,
corresponding to 643 amino acids. Further screening of the lambda library
led to the identification of a 673 bp KpnI fragment which was similarly
cloned into the KpnI site of pBluescript, creating plasmid pFB4. The
termination codon was found within the KpnI fragment, adding an extra 51
base pairs to the heparinase III gene and an additional 16 amino acids to
the heparinase III protein. The complete heparinase III gene was later
found to be included within a 3.2 kilobase PstI fragment from lambda
plaque #118. The complete heparinase III gene from Flavobacterium is
thus 1980 base pairs in length, Figure 8, and encodes a 659 amino acid
protein, Figure 9. N-terminal amino acid sequencing of deblocked,
processed heparinase III indicated that the mature protein begins with
Q-25, and contains 635 amino acids with a calculated molecular weight of
73,135 Daltons, Figure 9.
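
The lengths quoted in this paragraph follow from simple codon arithmetic. The sketch below reproduces them; it assumes only that a complete gene length includes one 3 bp termination codon, and the helper name is illustrative.

```python
def protein_length(orf_bp, includes_stop):
    """Amino acids encoded by an open reading frame of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 bp")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 2.7 kb HindIII fragment: open reading frame without a termination codon.
partial_aa = protein_length(1929, includes_stop=False)

# The KpnI fragment contributes 51 bp: 16 codons plus the termination codon.
extra_aa = protein_length(51, includes_stop=True)

# Complete gene, termination codon included.
full_aa = protein_length(1929 + 51, includes_stop=True)

# The mature protein starts at Q-25, so residues 1-24 are removed.
mature_aa = full_aa - 24

print(partial_aa, extra_aa, full_aa, mature_aa)
```

The partial and additional residue counts add up to the length of the complete protein, and removing the first 24 residues gives the length stated for the mature enzyme.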

EXAMPLE 9: Expression of Heparinase III in E. coli
PCR was used to generate a mature, truncated heparinase III gene,
which had 16 amino acids deleted from the carboxy-terminus of the
protein. An oligonucleotide comprised of
5'-CGCGGATCCATGCAAAGCTCTTCCATT-3' (SEQU ID NO:25) was designed to insert an ATG start site
immediately preceding the codon for the first amino acid (Q-25) of mature
heparinase III, while an oligonucleotide comprised of
5'-CGCGGATCCTCAAAGCTTGCCTTTCTC-3' (SEQU ID NO:26) was designed to insert a
termination codon after the last amino acid of the heparinase III gene on
the 2.7 kb HindIII fragment. Both oligonucleotides also contained a BamHI
site. Plasmid pHindIIIBD was used as the template in a PCR reaction with
an annealing temperature of 50° C. A specific fragment of the expected
size, 1857 base pairs, was obtained. This fragment encodes a protein of
620 amino acids with a calculated MW of 71,535 Da. It was isolated and
inserted in the BamHI site of the expression vector pGB. This construct
was named pGB-H3Δ3', Figure 7.
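
The design of the two primers can be checked by translating them. The following is a minimal sketch using the standard genetic code; the helper names are illustrative, and only the two primer sequences themselves are taken from the text above.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq ('*' marks a stop)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


BAMHI_SITE = "GGATCC"
forward = "CGCGGATCCATGCAAAGCTCTTCCATT"  # SEQU ID NO:25
reverse = "CGCGGATCCTCAAAGCTTGCCTTTCTC"  # SEQU ID NO:26

# Coding part of the forward primer: everything after the BamHI site.
fwd_coding = forward[forward.index(BAMHI_SITE) + len(BAMHI_SITE):]
# The reverse primer anneals to the coding strand, so read its complement.
rev_coding = reverse_complement(reverse)

print(translate(fwd_coding))  # start codon followed by the mature N-terminus
print(translate(rev_coding))  # last residues, then the inserted stop codon
```

Read this way, the forward primer places a methionine codon directly in front of the Gln-Ser-Ser-Ser-Ile codons of the mature N-terminus, and the reverse primer places a TGA codon directly after the Glu-Lys-Gly-Lys-Leu codons that end the open reading frame on the HindIII fragment.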

To add the missing 3' region of heparinase III, the BspEI/SalI
restriction fragment from pGB-H3Δ3' was removed and replaced with the
BspEI/SalI fragment from pFB5. The construct containing the complete
heparinase III gene was named pGBH3, Figure 7. Recombinant heparinase
III is a protein of 637 amino acids with a calculated molecular weight of
73,266 Daltons. E. coli strain XL-1 Blue(pGBH3) was grown at 37° C in LB
medium containing 75 µg/ml kanamycin to an OD600 of 0.5, at which point
the tac promoter from pGB was induced by the addition of 1 mM IPTG.
Cultures were grown an additional 2-5 hours at either 23° C, 30° C or 37° C.
The cells were cooled on ice, concentrated by centrifugation and
resuspended in cold PBS at 1/10th the original culture volume. Cells were
lysed by sonication and cell debris removed by centrifugation at 10,000 x
g for 5 minutes. The pellet and supernatant fractions were analyzed for
heparan sulfate degrading (heparinase III) activity. Heparan sulfate
degrading activities of 1.29, 5.27 and 3.29 IU/ml were observed from
cultures grown at 23°, 30° and 37° C, respectively.

The present invention describes a methodology for obtaining highly 
purified heparin and heparan sulfate degrading proteins by expressing the 
genes for these proteins in a suitable expression system and applying the 
steps of cell disruption, cation exchange chromatography, affinity 
chromatography and hydroxylapatite chromatography. Variations of these 
methods will be obvious to those skilled in the art from the foregoing 
detailed description of the invention. Such modifications are intended to 
come within the scope of the appended claims. 
TABLE 1

Purification of heparinase enzymes from Flavobacterium heparinum fermentations

sample                         activity    specific activity    yield
                               (IU)                             (%)

fermentation
  heparin degrading            39,700      1.06                 100
  heparan sulfate degrading    75,400      ND                   100

osmolate
  heparin degrading            15,749      ND                   40
  heparan sulfate degrading    42,000      ND                   56

cation exchange
  heparin degrading            12,757      ND                   32
  heparan sulfate degrading    27,540      ND                   37

cellufine sulfate
  heparin degrading            8,190       ND                   21
  heparan sulfate degrading    9,328       30.8                 12

hydroxylapatite
  heparinase I                 7,150       115.3                18
  heparinase II                2,049       28.41                3
  heparinase III               5,150       44.46                7
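
The yield column of Table 1 can be recomputed from the activity column. The sketch below assumes that each yield is the recovered activity divided by the activity of the same type in the starting fermentation, rounded to a whole percent; the dictionary layout and function name are illustrative, and only the steps up to cellufine sulfate are included.

```python
# Total activity (IU) after each step, taken from Table 1.
ACTIVITY_IU = {
    "heparin degrading": {
        "fermentation": 39_700,
        "osmolate": 15_749,
        "cation exchange": 12_757,
        "cellufine sulfate": 8_190,
    },
    "heparan sulfate degrading": {
        "fermentation": 75_400,
        "osmolate": 42_000,
        "cation exchange": 27_540,
        "cellufine sulfate": 9_328,
    },
}


def percent_yield(activity, starting_activity):
    """Recovered activity as a percentage of the starting activity."""
    return 100.0 * activity / starting_activity


def yields_for(kind):
    """Rounded percent yield for every purification step of one activity."""
    steps = ACTIVITY_IU[kind]
    start = steps["fermentation"]
    return {step: round(percent_yield(iu, start)) for step, iu in steps.items()}


if __name__ == "__main__":
    for kind in ACTIVITY_IU:
        print(kind, yields_for(kind))
```

Under that assumption the recomputed values agree with the tabulated yields for these steps.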
TABLE 2

Properties of heparinases from Flavobacterium heparinum

sample                   heparinase I    heparinase II    heparinase III

Km (µM)                  l/.o            57.7             29.4
kcat (s-1)               157             23.3             164
substrate specificity    H               H and HS         HS
N-terminal peptide       QQKKSG          QTKADV           QSSSIT
glycosylation            yes             yes

H - heparin, HS - heparan sulfate
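
The kinetic constants in Table 2 can be combined into a catalytic efficiency, kcat/Km. The following is a minimal sketch using the heparinase II and heparinase III entries, reading the heparinase III kcat as 164 s-1 and taking Km in µM; the variable and function names are illustrative, and the comparison is an added illustration rather than a result stated in the specification.

```python
# (Km in micromolar, kcat in 1/s) as listed in Table 2.
KINETICS = {
    "heparinase II": (57.7, 23.3),
    "heparinase III": (29.4, 164.0),
}


def catalytic_efficiency(km_um, kcat_per_s):
    """kcat/Km in units of 1/(micromolar * second)."""
    if km_um <= 0:
        raise ValueError("Km must be positive")
    return kcat_per_s / km_um


efficiency = {
    name: catalytic_efficiency(km, kcat) for name, (km, kcat) in KINETICS.items()
}

if __name__ == "__main__":
    for name, value in efficiency.items():
        print(f"{name}: {value:.3f} per uM per s")
```

Note that the two enzymes were characterized on different substrates, so the ratio is only a rough way of putting the tabulated numbers side by side.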
Codon usage table for Flavobacterium and Escherichia coli

amino acid    codon(s)                         consensus codon
                                               E. coli         Flavobacterium

A             GCT, GCC, GCG, GCA               GCT             GCC
C             TGT, TGC                         EITHER          EITHER
D             GAT, GAC                         EITHER          EITHER
E             GAG, GAA                         GAA             GAA
F             TTC, TTT                         EITHER          TTT
G             GGC, GGA, GGG, GGT               GGC or GGT      GGC
H             CAC, CAT                         CAT             CAT
I             ATC, ATA, ATT                    ATA             ATC
K             AAA, AAG                         AAA             AAA
L             CTT, CTA, CTG, TTG, TTA, CTC     CTG             CTG
M             ATG                              ATG             ATG
N             AAC, AAT                         AAC             AAT
P             CCC, CCT, CCA, CCG               CCG             CCG
Q             CAG, CAA                         CAG             CAG
R             CGT, AGA, CGC, CGA, AGG, CGG     CGT             CGC
S             TCA, TCC, TCG, TCT, AGC, AGT     TCT
T             ACG, ACC, ACT, ACA               ACC or ACT      ACC or ACA
V             GTC, GTA, GTT, GTG               GTT             ?
W             TGG                              TGG             TGG
Y             TAC, TAT                         EITHER          TAT
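
A codon usage table of this kind can be applied to design a non-degenerate oligonucleotide from a peptide sequence. The sketch below is an illustration only: it includes just the amino acids for which the Flavobacterium column gives a single consensus codon, treats every other residue as unresolved, and uses illustrative function names.

```python
# Single consensus codons from the Flavobacterium column of the table.
FLAVO_CONSENSUS = {
    "A": "GCC", "E": "GAA", "F": "TTT", "G": "GGC", "H": "CAT",
    "I": "ATC", "K": "AAA", "L": "CTG", "M": "ATG", "N": "AAT",
    "P": "CCG", "Q": "CAG", "R": "CGC", "W": "TGG", "Y": "TAT",
}

ECORI_SITE = "GAATTC"


def back_translate(peptide):
    """Back-translate a one-letter peptide using single consensus codons."""
    codons = []
    for residue in peptide:
        if residue not in FLAVO_CONSENSUS:
            raise ValueError(f"no single consensus codon for {residue!r}")
        codons.append(FLAVO_CONSENSUS[residue])
    return "".join(codons)


def with_ecori(peptide):
    """Prefix the back-translated sequence with an EcoRI recognition site."""
    return ECORI_SITE + back_translate(peptide)


if __name__ == "__main__":
    # His-Gln-Phe-Gln-Pro-His-Lys, a stretch of the heparinase III sequence.
    print(with_ecori("HQFQPHK"))
```

For the stretch His-Gln-Phe-Gln-Pro-His-Lys this reproduces the sequence listed as SEQ ID NO:17, including its GAATTC prefix.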
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT(s): IBEX TECHNOLOGIES and ZIMMERMANN, Joseph

(ii) TITLE OF INVENTION: Nucleic Acid Sequences And Expression
Systems For Heparinase II And Heparinase III Derived From
Flavobacterium heparinum

(iii) NUMBER OF SEQUENCES: 26

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Hale and Dorr
(B) STREET: 1455 Pennsylvania Avenue, N.W.
(C) CITY: Washington, D.C.
(E) COUNTRY: U.S.A.
(F) ZIP: 20004

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US95/07391
(B) FILING DATE: 09-JUNE-1995
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/258,639
(B) FILING DATE: 10 JUNE 1994

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: BAKER, Hollie L.
(B) REGISTRATION NUMBER: 31,321
(C) REFERENCE/DOCKET NUMBER: 104385.116PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (202)942-8400
(B) TELEFAX: (202)942-8484

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2339 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
ATGAAAAGAC AATTATACCT GTATGTGATT TTTGTTGTAG TTGAACTTAT GGTTTTTACA 60
ACAAAGGGCT ATTCCCAAAC CAAGGCCGAT GTGGTTTGGA AAGACGTGGA TGGCGTATCT 120
ATGCCCATAC CCCCTAAGAC CCACCCGCGT TTGTATCTAC GTGAGCAGCA AGTTCCTGAC 180
CTGAAAAACA GGATGAACGA CCCTAAACTG AAAAAAGTTT GGGCCGATAT GATCAAGATG 240
CAGGAAGACT GGAAGCCAGC TGATATTCCT GAAGTTAAAG ACTTTCGTTT TTATTTTAAC 300
CAGAAAGGGC TTACTGTAAG GGTTGAACTA ATGGCCCTGA ACTATCTGAT GACCAAGGAT 360
CCAAAGGTAG GACGGGAAGC CATCACTTCA ATTATTGATA CCCTTGAAAC TGCAACTTTT 420
AAACCAGCAG GTGATATTTC GAGAGGGATA GTGATATTTC GAGAGGGATA GGCCTGTTTA 480
TGGTTACAGG GGCCATTGTG TATGACTGGT GCTACGATCA GCTGAAACCA GAAGAGAAAA 540
CACGTTTTGT GAAGGCATTT GTGAGGCTGG CCAAAATGCT CGAATGTGGT TATCCTCCGG 600
TAAAAGACAA GTCTATTGTT GGGCATGCTT CCGAATGGAT GATCATGCGG GACCTGCTTT 660
CTGTAGGGAT TGCCATTTAC GATGAATTCC CTGAGATGTA TAACCTGGCT GCGGGTCGTT 720
TTTTCAAAGA ACACCTGGTT GCCCGCAACT GGTTTTATCC CTCGCATAAC TACCATCAGG 780
GTATGTCATA CCTGAACGTA AGATTTACCA ACGACCTTTT TGCCCTCTGG ATATTAGACC 840
GGATGGGCGC TGGTAATGTG TTTAATCCAG GGCAGCAGTT TATCCTTTAT GACGCGATCT 900
ATAAACGCCG CCCCGATGGA CAGATTTTAG CAGGTGGAGA TGTAGATTAT TCCAGGAAAA 960
AACCAAAATA TTATACGATG CCTGCATTGC TTGCAGGTAG CTATTATAAA GATGAATACC 1020
TTAATTACGA ATTCCTGAAA GATCCCAATG TTGAGCCACA TTGCAAATTG TTCGAATTTT 1080
TATGGCGCGA TACCCAGTTG GGAAGTCGTA AGCCTGATGA TTTGCCACTT TCCAGGTACT 1140
CAGGATCGCC TTTTGGATGG ATGATTGCCC GTACCGGATG GGGTCCGGAA AGTGTGATTG 1200
CAGAGATGAA AGTCAACGAA TATTCCTTTC TTAACCATCA GCATCAGGAT GCAGGAGCCT 1260
TCCAGATCTA TTACAAAGGC CCGCTGGCCA TAGATGCAGG CTCGTATACA GGTTCTTCAG 1320
GAGGTTATAA CAGTCCGCAC AACAAGAACT TTTTTAAGCG GACTATTGCA CACAATAGCT 1380
TGCTGATTTA CGATCCTAAA GAAACTTTCA GTTCGTCGGG ATATGGTGGA AGTGACCATA 1440
CCGATTTTGC TGCCAACGAT GGTGGTCAGC GGCTGCCCGG AAAAGGTTGG ATTGCACCCC 1500
GCGACCTTAA AGAAATGCTG GCAGGCGATT TCAGGACCGG CAAAATTCTT GCCCAGGGCT 1560
TTGGTCCGGA TAACCAAACC CCTGATTATA CTTATCTGAA AGGAGACATT ACAGCAGCTT 1620
ATTCGGCAAA AGTGAAGGAA GTAAAACGTT CATTTCTATT CCTGAACCTT AAGGATGCCA 1680
AAGTTCCGGC AGCGATGATC GTTTTTGACA AGGTAGTTGC TTCCAATCCT GATTTTAAGA 1740
AGTTCTGGTT GTTGCACAGT ATTGAGCAGC CTGAAATAAA GGGGAATCAG ATTACCATAA 1800
AACGTACAAA AAACGGTGAT AGTGGGATGT TGGTGAATAC GGCTTTGCTG CCGGATGCGG 1860
CCAATTCAAA CATTACCTCC ATTGGCGGCA AGGGCAAAGA CTTCTGGGTG TTTGGTACCA 1920
ATTATACCAA TGATCCTAAA CCGGGCACGG ATGAAGCATT GGAACGTGGA GAATGGCGTG 1980
TGGAAATCAC TCCAAAAAAG GCAGCAGCCG AAGATTACTA CCTGAATGTG ATACAGATTG 2040
CCGACAATAC ACAGCAAAAA TTACACGAGG TGAAGCGTAT TGACGGTGAC AAGGTTGTTG 2100
GTGTGCAGCT TGCTGACAGG ATAGTTACTT TTAGCAAAAC TTCAGAAACT GTTGATCGTC 2160
CCTTTGGCTT TTCCGTTGTT GGTAAAGGAA CATTCAAATT TGTGATGACC GATCTTTTAG 2220
CGGGTACCTG GCAGGTGCTG AAAGACGGAA AAATACTTTA TCCTGCGCTT TCTGCAAAAG 2280
GTGATGATGG ACCCCTTTAT TTTGAAGGAA CTGAAGGAAC CTACCGTTTT TTGAGATAA 2339



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 772 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Lys Arg Gln Leu Tyr Leu Tyr Val Ile Phe Val Val Val Glu Leu
1 5 10 15

Met Val Phe Thr Thr Lys Gly Tyr Ser Gln Thr Lys Ala Asp Val Val
20 25 30

Trp Lys Asp Val Asp Gly Val Ser Met Pro Ile Pro Pro Lys Thr His
35 40 45

Pro Arg Leu Tyr Leu Arg Glu Gln Gln Val Pro Asp Leu Lys Asn Arg
50 55 60

Met Asn Asp Pro Lys Leu Lys Lys Val Trp Ala Asp Met Ile Lys Met
65 70 75 80

Gln Glu Asp Trp Lys Pro Ala Asp Ile Pro Glu Val Lys Asp Phe Arg
85 90 95

Phe Tyr Phe Asn Gln Lys Gly Leu Thr Val Arg Val Glu Leu Met Ala
100 105 110

Leu Asn Tyr Leu Met Thr Lys Asp Pro Lys Val Gly Arg Glu Ala Ile
115 120 125

Thr Ser Ile Ile Asp Thr Leu Glu Thr Ala Thr Phe Lys Pro Ala Gly
130 135 140

Asp Ile Ser Arg Gly Ile Gly Leu Phe Met Val Thr Gly Ala Ile Val
145 150 155 160

Tyr Asp Trp Cys Tyr Asp Gln Leu Lys Pro Glu Glu Lys Thr Arg Phe
165 170 175

Val Lys Ala Phe Val Arg Leu Ala Lys Met Leu Glu Cys Gly Tyr Pro
180 185 190

Pro Val Lys Asp Lys Ser Ile Val Gly His Ala Ser Glu Trp Met Ile
195 200 205

Met Arg Asp Leu Leu Ser Val Gly Ile Ala Ile Tyr Asp Glu Phe Pro
210 215 220

Glu Met Tyr Asn Leu Ala Ala Gly Arg Phe Phe Lys Glu His Leu Val
225 230 235 240

Ala Arg Asn Trp Phe Tyr Pro Ser His Asn Tyr His Gln Gly Met Ser
245 250 255

Tyr Leu Asn Val Arg Phe Thr Asn Asp Leu Phe Ala Leu Trp Ile Leu
260 265 270

Asp Arg Met Gly Ala Gly Asn Val Phe Asn Pro Gly Gln Gln Phe Ile
275 280 285

Leu Tyr Asp Ala Ile Tyr Lys Arg Arg Pro Asp Gly Gln Ile Leu Ala
290 295 300

Gly Gly Asp Val Asp Tyr Ser Arg Lys Lys Pro Lys Tyr Tyr Thr Met
305 310 315 320

Pro Ala Leu Leu Ala Gly Ser Tyr Tyr Lys Asp Glu Tyr Leu Asn Tyr
325 330 335

Glu Phe Leu Lys Asp Pro Asn Val Glu Pro His Cys Lys Leu Phe Glu
340 345 350

Phe Leu Trp Arg Asp Thr Gln Leu Gly Ser Arg Lys Pro Asp Asp Leu
355 360 365

Pro Leu Ser Arg Tyr Ser Gly Ser Pro Phe Gly Trp Met Ile Ala Arg
370 375 380

Thr Gly Trp Gly Pro Glu Ser Val Ile Ala Glu Met Lys Val Asn Glu
385 390 395 400

Tyr Ser Phe Leu Asn His Gln His Gln Asp Ala Gly Ala Phe Gln Ile
405 410 415

Tyr Tyr Lys Gly Pro Leu Ala Ile Asp Ala Gly Ser Tyr Thr Gly Ser
420 425 430

Ser Gly Gly Tyr Asn Ser Pro His Asn Lys Asn Phe Phe Lys Arg Thr
435 440 445

Ile Ala His Asn Ser Leu Leu Ile Tyr Asp Pro Lys Glu Thr Phe Ser
450 455 460

Ser Ser Gly Tyr Gly Gly Ser Asp His Thr Asp Phe Ala Ala Asn Asp
465 470 475 480

Gly Gly Gln Arg Leu Pro Gly Lys Gly Trp Ile Ala Pro Arg Asp Leu
485 490 495

Lys Glu Met Leu Ala Gly Asp Phe Arg Thr Gly Lys Ile Leu Ala Gln
500 505 510

Gly Phe Gly Pro Asp Asn Gln Thr Pro Asp Tyr Thr Tyr Leu Lys Gly
515 520 525

Asp Ile Thr Ala Ala Tyr Ser Ala Lys Val Lys Glu Val Lys Arg Ser
530 535 540

Phe Leu Phe Leu Asn Leu Lys Asp Ala Lys Val Pro Ala Ala Met Ile
545 550 555 560

Val Phe Asp Lys Val Val Ala Ser Asn Pro Asp Phe Lys Lys Phe Trp
565 570 575

Leu Leu His Ser Ile Glu Gln Pro Glu Ile Lys Gly Asn Gln Ile Thr
580 585 590

Ile Lys Arg Thr Lys Asn Gly Asp Ser Gly Met Leu Val Asn Thr Ala
595 600 605

Leu Leu Pro Asp Ala Ala Asn Ser Asn Ile Thr Ser Ile Gly Gly Lys
610 615 620

Gly Lys Asp Phe Trp Val Phe Gly Thr Asn Tyr Thr Asn Asp Pro Lys
625 630 635 640

Pro Gly Thr Asp Glu Ala Leu Glu Arg Gly Glu Trp Arg Val Glu Ile
645 650 655

Thr Pro Lys Lys Ala Ala Ala Glu Asp Tyr Tyr Leu Asn Val Ile Gln
660 665 670

Ile Ala Asp Asn Thr Gln Gln Lys Leu His Glu Val Lys Arg Ile Asp
675 680 685

Gly Asp Lys Val Val Gly Val Gln Leu Ala Asp Arg Ile Val Thr Phe
690 695 700

Ser Lys Thr Ser Glu Thr Val Asp Arg Pro Phe Gly Phe Ser Val Val
705 710 715 720

Gly Lys Gly Thr Phe Lys Phe Val Met Thr Asp Leu Leu Ala Gly Ile
725 730 735

Trp Gln Val Leu Lys Asp Gly Lys Ile Leu Tyr Pro Ala Leu Ser Ala
740 745 750

Lys Gly Asp Asp Gly Pro Leu Tyr Phe Glu Gly Thr Glu Gly Thr Tyr
755 760 765

Arg Phe Leu Arg
770



(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1980 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



ATGACTACGA AAATTTTTAA AAGGATPA^t HTAT'P'P ptp ^T^^^mnnr^^ 


50 


ATCGTCGGGA AATATAPTTP PAPAAAPPTP TTrrtTr?\pr Ar-r-.* * *rr~n 
« A v-vj j. x-w^^rt i * /-iu i iu u>rtL./\>i/itjL. i L I 1 LLA 1 1 ALL AGGAAAGATT 


100 


iiuAiLrtL/ii ^>\AbLiibAb IATTCCGGAC tggaaaaggt TAATAAAGCA 


150 


biibiibtlb bLAALTATGA CGATGCGGCC AAAGCATTAC TGGCATACTA 


200 


^aubbAAMA Ab I AAbGCCA GGGAACCTGA TTTCAGTAAT GCAGAAAAGC 


250 


LibbLbAlAI ALbLCAGCCC ATAGATAAGG TTACGCGTGA AATGGCCGAC 


300 


^bbbu ibb i LLALLAbTT TCAACCGCAC AAAGGCTACG G CTATTTTG A 


350 


AAA bALAiCAACi GuCAGATGTG GCCGGTAAAA GACAATGAAG 


400 


irtLbL 1 ^^>^a bi lbLALLbT GTAAAATGGT GGCAGGCTAT GGCCCTGGTT 


450 


iHiLALbbiA bbbbLbAIbA AAAATATGCA AGAGAATGGG TATATCAGTA 


500 


LAbLbAi ibb bLLAbAAAAA ACCCATTGGG CCTGTCGCAG GATAATGATA 


550 


A A r r r r r rr2T , f^'™m rrrpppppTT" o^nrirnnmino^ ^ 

AAiiibiblb bLbbLLCCTT GAAGTGTCGG ACAGGGTACA AAGTCTTCCC 


600 


PPAAPPTTP A nfTTATTTPT A A APTPPOPn 

w^l-l, i i l->\ bbi lAi i ibl AAALTCGCCA GCCTTTACCC CAGCCTTTTT 


650 


AATGGAATTT TT* AABTZiP, TT rppappa a o a r»r^/^/^^» r™-n» m ,. _ 

nniuoAftiii x lAAALAbi 1 ALLALLAACA G G C C G ATT AT TTATCTACGC 


700 


ATTATGPPPA ATAPPPA A ap p?\pppTTTTiT tn-h/-^ _ ~, , » ^ 

o^uo/i Mk_>i(j(j(jAAAL LALLLr I J/TAT TTGAAGCCCA ACGCAACTTG 


750 


TTTGCAGGGG TATPTTTPPP TP ziattt a a a r» at>t>^ a „ _ , _ ____ — 
nivjv./ivjyjuu i i i ibAAl 1 1AAA CjAi l CACCAA GATGGAGGCA 


800 


AACCGGCATA TCGGTGCTGA APAPPPAPAT pa a a a a ap-a^ r"T~T~r Arn^.~^^ 


850 


ATGGGATGCA GTTTGAACTT TPAPPA ATTT arraTPTApp 

w a * j. vjrvi.v_ i j. i ibl ALL I bL CATC GAT 


900 


ATCTTCTTAA AGGCCTATGG TTCTGCAAAA PPAPTTAapp ttpa a a a a^a 


950 


ATTTCCGCAA TCTTATGTAC AAAPTPTAPA aaaTiTpRTr A-nr^r^onn^ ^ 


1000 


TCAGTATTTC ACTGCCAGAT TATAAPAPPP PTSTPrn'pp A ^^ r ™^ 7 . 

± arti/irt^ML.Lb biAibillbb Ab A TT CATG G 


1050 


ATTACAGATA AAA ATTTP AP PATPPPAnAr 1 Tr^r* r*r> a r*r*»-n ( -, < - 1/ -, / ^>-, / ~,_ 1/ -,^ 1 _„ 

" w-»u-r-i x « nrtrt "'L -l j. ^— rtvj o>i i oL^L-^L^Ab I X 1 LLCAGCT GGGCCCGGGT 


1100 


TTTPPPPPPA AiPPfiPPPPA T" A A A A T» TV rnrpm rrtr^ nrr* •» ^ » m 

iiiLi^bLA AALLAbbLLA TAAAATATTT TGCTACAGAT GGCAAACAAG 


1150 


GTAAGGCGCC TAACTTTTTA TCCAAAGCAT TGAGCAATGC AGGCTTTTAT 


1200 


ACGTTTAGAA GCGGATGGGA TAAAAATGCA ACCGTTATGG TATTAAAAGC 


1250 


CAGTCCTCCC GGGGAATTTC ATGCCCAGCC GGATAACGGG ACTTTTGAAC 


1300 


TTTTTATAAA GGGCAGAAAC TTTACCCCAG ACGCCGGGGT ATTTGTGTAT 


1350 


AGCGGCGACG AAGCCATCAT GAAACTGCGG AACTGGTACC GTCAAACCCG 


1400 


CATACACAGC ACG CTTACAC TCGACAATCA AAATATGGTC ATTAC C AAAG 


1450 
CCCGGCAAAA CAAATGGGAA ACAGGAAATA


1500


ACCAACCCAA GCTATCCGAA TCTGGACTAT I.M.L. 1 i 1 i LAI


1550


CAACAAAAAA TACTTTCTGG TPATrGATAr;


1600


GAAACCTGGG CGTACAfTfir: LLL1 (j I"I 1"I'C


1650


GATAAGAPAA Cj I AALAACCT


1700


GATGATCCAA TCGTTGAATG CGGACAGGAC CAGCCTCAAT GAAGAAGAAG


1750


GAAAGGTATC TTATGTTTAC AATAAGGAGC TGAAAAGACC TGCTTTCGTA


1800


TTTGAAAAGC CTAAAAAGAA TGCCGGCACA CAAAATTTTG TCAGTATAGT


1850


TTATCCATAC GACGGCCAGA AGGCTCCAGA GATCAGCATA CGGGAAAACA


1900


AGGGCAATGA TTTTGAGAAA GGCAAGCTTA ATCTAACCCT TACCATTAAC


1950


GGAAAACAAC AGCTTGTGTT GGTTCCTTAG


1980



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 659 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Thr Thr Lys Ile Phe Lys Arg Ile Ile Val Phe Ala Val Ile Ala
1 5 10 15

Leu Ser Ser Gly Asn Ile Leu Ala Gln Ser Ser Ser Ile Thr Arg Lys
20 25 30

Asp Phe Asp His Ile Asn Leu Glu Tyr Ser Gly Leu Glu Lys Val Asn
35 40 45

Lys Ala Val Ala Ala Gly Asn Tyr Asp Asp Ala Ala Lys Ala Leu Leu
50 55 60

Ala Tyr Tyr Arg Glu Lys Ser Lys Ala Arg Glu Pro Asp Phe Ser Asn
65 70 75 80

Ala Glu Lys Pro Ala Asp Ile Arg Gln Pro Ile Asp Lys Val Thr Arg
85 90 95

Glu Met Ala Asp Lys Ala Leu Val His Gln Phe Gln Pro His Lys Gly
100 105 110

Tyr Gly Tyr Phe Asp Tyr Gly Lys Asp Ile Asn Trp Gln Met Trp Pro
115 120 125

Val Lys Asp Asn Glu Val Arg Trp Gln Leu His Arg Val Lys Trp Trp
130 135 140

Gln Ala Met Ala Leu Val Tyr His Ala Thr Gly Asp Glu Lys Tyr Ala
145 150 155 160

Arg Glu Trp Val Tyr Gln Tyr Ser Asp Trp Ala Arg Lys Asn Pro Leu
165 170 175

Gly Leu Ser Gln Asp Asn Asp Lys Phe Val Trp Arg Pro Leu Glu Val
180 185 190

Ser Asp Arg Val Gln Ser Leu Pro Pro Thr Phe Ser Leu Phe Val Asn
195 200 205

Ser Pro Ala Phe Thr Pro Ala Phe Leu Met Glu Phe Leu Asn Ser Tyr
210 215 220

His Gln Gln Ala Asp Tyr Leu Ser Thr His Tyr Ala Glu Gln Gly Asn
225 230 235 240

His Arg Leu Phe Glu Ala Gln Arg Asn Leu Phe Ala Gly Val Ser Phe
245 250 255

Pro Glu Phe Lys Asp Ser Pro Arg Trp Arg Gln Thr Gly Ile Ser Val
260 265 270

Leu Asn Thr Glu Ile Lys Lys Gln Val Tyr Ala Asp Gly Met Gln Phe
275 280 285

Glu Leu Ser Pro Ile Tyr His Val Ala Ala Ile Asp Ile Phe Leu Lys
290 295 300

Ala Tyr Gly Ser Ala Lys Arg Val Asn Leu Glu Lys Glu Phe Pro Gln
305 310 315 320

Ser Tyr Val Gln Thr Val Glu Asn Met Ile Met Ala Leu Ile Ser Ile
325 330 335

Ser Leu Pro Asp Tyr Asn Thr Pro Met Phe Gly Asp Ser Trp Ile Thr
340 345 350

Asp Lys Asn Phe Arg Met Ala Gln Phe Ala Ser Trp Ala Arg Val Phe
355 360 365

Pro Ala Asn Gln Ala Ile Lys Tyr Phe Ala Thr Asp Gly Lys Gln Gly
370 375 380

Lys Ala Pro Asn Phe Leu Ser Lys Ala Leu Ser Asn Ala Gly Phe Tyr
385 390 395 400

Thr Phe Arg Ser Gly Trp Asp Lys Asn Ala Thr Val Met Val Leu Lys
405 410 415

Ala Ser Pro Pro Gly Glu Phe His Ala Gln Pro Asp Asn Gly Thr Phe
420 425 430

Glu Leu Phe Ile Lys Gly Arg Asn Phe Thr Pro Asp Ala Gly Val Phe
435 440 445

Val Tyr Ser Gly Asp Glu Ala Ile Met Lys Leu Arg Asn Trp Tyr Arg
450 455 460

Gln Thr Arg Ile His Ser Thr Leu Thr Leu Asp Asn Gln Asn Met Val
465 470 475 480

Ile Thr Lys Ala Arg Gln Asn Lys Trp Glu Thr Gly Asn Asn Leu Asp
485 490 495

Val Leu Thr Tyr Thr Asn Pro Ser Tyr Pro Asn Leu Asp His Gln Arg
500 505 510

Ser Val Leu Phe Ile Asn Lys Lys Tyr Phe Leu Val Ile Asp Arg Ala
515 520 525

Ile Gly Glu Ala Thr Gly Asn Leu Gly Val His Trp Gln Leu Lys Glu
530 535 540

Asp Ser Asn Pro Val Phe Asp Lys Thr Lys Asn Arg Val Tyr Thr Thr
545 550 555 560

Tyr Arg Asp Gly Asn Asn Leu Met Ile Gln Ser Leu Asn Ala Asp Arg
565 570 575

Thr Ser Leu Asn Glu Glu Glu Gly Lys Val Ser Tyr Val Tyr Asn Lys
580 585 590

Glu Leu Lys Arg Pro Ala Phe Val Phe Glu Lys Pro Lys Lys Asn Ala
595 600 605

Gly Thr Gln Asn Phe Val Ser Ile Val Tyr Pro Tyr Asp Gly Gln Lys
610 615 620

Ala Pro Glu Ile Ser Ile Arg Glu Asn Lys Gly Asn Asp Phe Glu Lys
625 630 635 640

Gly Lys Leu Asn Leu Thr Leu Thr Ile Asn Gly Lys Gln Gln Leu Val
645 650 655

Leu Val Pro



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Glu Phe Pro Glu Met Tyr Asn Leu Ala Ala Gly Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Lys Pro Ala Asp Ile Pro Glu Val Lys Asp Gly Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Leu Ala Gly Asp Phe Val Thr Gly Lys Ile Leu Ala Gln Gly Phe Gly
1 5 10 15

Pro Asp Asn Gln Thr Pro Asp Tyr Thr Tyr Leu
20 25

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Leu Ile Lys Asn Glu Val Arg Trp Gln Leu His Arg Val Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Val Leu Lys Ala Ser Pro Pro Gly Glu Phe His Ala Gln Pro Asp Asn
1 5 10 15

Gly Thr Phe Glu Leu Phe Ile
20

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Lys Ala Leu Val His Trp Phe Trp Pro His Lys Gly Tyr Gly Tyr Phe
1 5 10 15

Asp Tyr Gly Lys Asp Ile Asn
20

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GAATTCCCTG AGATGTACAA TCTGGCCGC 29

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
CCGGCAGCCA GATTGTACAT TTCAGG 26

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AAACCCGCCG AGATTCCCGA AGTAAAAGA 29

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CGAAAGTCTT TTACTTCGGG AATGTCGGC 29

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
TGAGGATTCA TGCAAACCAA GGCCGATGTG GTTTGGAA 38

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GGAGGATAAC CACATTCGAG CATT 24
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
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(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE; DNA (genomic) ' 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GAATTCCATC AGTTTCAGCC GC AT AAA 2 7 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GAATTCTTTA TGCGGCTGAA ACTGATG 2 7 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GAATTCCCGC CGGGCGAATT TCATGC 2 6 

(2) . INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20; 

GAATTCGCAT GAAATTCGCC CGGCGG 26 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
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(B) TYPE : nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 1 : 
GGGAATTTCC ATGCCCAGCC GAAATGGAC 
(2) INFORMATION FOR SEQ ID NO: 22: 

(l) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
GTCCATTTCG GCTGGGCATG AAATTCCC 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

( C ) STRANDEDNESS : doub 1 e 

( D ) TOPOLOGY : 1 inear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 23: 
GTCATCAGTT CAGCCCATAA AGGTATGG 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CCCATACCTT ATGGGCTGAA CTGATGAC 28

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CGCGGATCCA TGCAAAGCTC TTCCATT 27

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CGCGGATCCT CAAAGCTTGC CTTTCTC 27
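
The oligonucleotide entries above each declare a LENGTH and then print the sequence in blocks of ten bases. The following is a minimal sketch, not part of the original listing, of how those two pieces of information can be cross-checked. The dictionary `OLIGOS` and the helper `length_mismatches` are hypothetical names introduced here for illustration; the sequences and declared lengths are copied from SEQ ID NO: 17 to 26 with the spacing removed.

```python
# Oligonucleotides SEQ ID NO: 17-26 as printed in the listing (spaces removed),
# paired with the LENGTH declared under SEQUENCE CHARACTERISTICS.
# OLIGOS and length_mismatches are illustrative names, not from the patent.
OLIGOS = {
    17: ("GAATTCCATCAGTTTCAGCCGCATAAA", 27),
    18: ("GAATTCTTTATGCGGCTGAAACTGATG", 27),
    19: ("GAATTCCCGCCGGGCGAATTTCATGC", 26),
    20: ("GAATTCGCATGAAATTCGCCCGGCGG", 26),
    21: ("GGGAATTTCCATGCCCAGCCGAAATGGAC", 29),
    22: ("GTCCATTTCGGCTGGGCATGAAATTCCC", 28),
    23: ("GTCATCAGTTCAGCCCATAAAGGTATGG", 28),
    24: ("CCCATACCTTATGGGCTGAACTGATGAC", 28),
    25: ("CGCGGATCCATGCAAAGCTCTTCCATT", 27),
    26: ("CGCGGATCCTCAAAGCTTGCCTTTCTC", 27),
}


def length_mismatches(oligos):
    """Return {seq_id: (declared, actual)} for entries whose lengths disagree."""
    return {
        seq_id: (declared, len(seq))
        for seq_id, (seq, declared) in oligos.items()
        if len(seq) != declared
    }


def uses_only_dna_letters(seq):
    """True if the sequence contains nothing but A, C, G and T."""
    return set(seq) <= set("ACGT")


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(OLIGOS))
    print("all plain DNA:", all(uses_only_dna_letters(s) for s, _ in OLIGOS.values()))
```

A check of this kind is mainly useful for catching transcription damage such as split or dropped bases; it says nothing about whether a primer is suitable for its intended use.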
We claim:

1. A recombinant nucleic acid sequence which encodes heparinase II 
from Flavobacterium heparinum. 

2. The nucleic acid sequence of claim 1 comprising the sequence of 
SEQ ID NO:1.

3. The nucleic acid sequence of claim 1 further comprising a nucleic 
acid sequence capable of directing the expression of said heparinase. 

4. The nucleic acid sequence of claim 3 comprising a modified ribosome 
binding region. 

5. A host cell transformed with a vector comprising the nucleic acid 
sequence of claim 3, said host cell being capable of expressing heparinase II.

6. The host cell of claim 5, wherein said host cell is E. coli. 

7. A recombinant nucleic acid sequence which encodes heparinase III 
from Flavobacterium heparinum. 

8. The nucleic acid sequence of claim 7 comprising the sequence of 
SEQ ID NO:3.

9. The nucleic acid sequence of claim 7 further comprising a nucleic 
acid sequence capable of directing the expression of said heparinase.

10. The nucleic acid sequence of claim 9 comprising a modified ribosome 
binding region. 

11. A host cell transformed with a vector comprising the nucleic acid 
sequence of claim 9, said host cell being capable of expressing heparinase 
III. 

12. The host cell of claim 11, wherein said host cell is E. coli. 

13. Isolated, recombinant heparinase II in substantially pure form. 

14. The heparinase II of claim 13 comprising the amino acid sequence of 
SEQ ID NO:2.

15. Isolated, recombinant heparinase III in substantially pure form. 

16. The heparinase III of claim 15 comprising the amino acid sequence 
of SEQ ID NO:4.

17. An expression vector for the expression of heparinases comprising a 
modified ribosome binding region containing a Shine-Dalgarno sequence, a 
spacer region between the Shine-Dalgarno sequence and the ATG start 
codon, and a recombinant nucleotide sequence encoding heparinase I, II or 
III. 

18. The expression vector of claim 17 wherein the Shine-Dalgarno 
sequence is 5 base pairs in length. 
19. The expression vector of claim 17 wherein the spacer region between
the Shine-Dalgarno sequence and the ATG start codon is 9 base pairs in 
length. 

20. A method of expressing genes from Flavobacterium species 
comprising constructing the expression vector of claim 17 and 
transforming a prokaryote host cell with said expression vector. 

21. The method of claim 20 wherein said expression vector encodes 
heparinase I. 

22. The method of claim 20 wherein said expression vector encodes 
heparinase II. 

23. The method of claim 20 wherein said expression vector encodes 
heparinase III. 

24. An antibody isolated from animals injected with a heparinase from F.
heparinum which is specific for the amino acid sequences of the
heparinase.

25. The antibody of claim 24 wherein said heparinase is heparinase I. 

26. The antibody of claim 24 wherein said heparinase is heparinase II. 

27. The antibody of claim 24 wherein said heparinase is heparinase III. 
28. An antibody isolated from animals injected with a heparinase which
is specific for non-amino acid moieties of post-translationally modified F.
heparinum proteins. 

29. The polyclonal antibody of claim 28 wherein said heparinase is 
heparinase I. 

30. The polyclonal antibody of claim 28 wherein said heparinase is 
heparinase II. 

31. The polyclonal antibody of claim 28 wherein said heparinase is 
heparinase III. 

32. A method of purifying heparinases from Flavobacterium heparinum 
comprising the steps of culturing F. heparinum cells, disrupting the cells, 
and performing cation exchange chromatography, affinity chromatography 
and hydroxylapatite chromatography. 
ATGAAAAGAC AATTATACCT GTATGTGATT TTTGTTGTAG TTGAACTTAT GGTTTTTACA 60

ACAAAGGGCT ATTCCCAAAC CAAGGCCGAT GTGGTTTGGA AAGACGTGGA TGGCGTATCT 120

ATGCCCATAC CCCCTAAGAC CCACCCGCGT TTGTATCTAC GTGAGCAGCA AGTTCCTGAC 180

CTGAAAAACA GGATGAACGA CCCTAAACTG AAAAAAGTTT GGGCCGATAT GATCAAGATG 240

CAGGAAGACT GGAAGCCAGC TGATATTCCT GAAGTTAAAG ACTTTCGTTT TTATTTTAAC 300

CAGAAAGGGC TTACTGTAAG GGTTGAACTA ATGGCCCTGA ACTATCTGAT GACCAAGGAT 360

CCAAAGGTAG GACGGGAAGC CATCACTTCA ATTATTGATA CCCTTGAAAC TGCAACTTTT 420

AAACCAGCAG GTGATATTTC GAGAGGGATA GGCCTGTTTA TGGTTACAGG GGCCATTGTG 480

TATGACTGGT GCTACGATCA GCTGAAACCA GAAGAGAAAA CACGTTTTGT GAAGGCATTT 540

GTGAGGCTGG CCAAAATGCT CGAATGTGGT TATCCTCCGG TAAAAGACAA GTCTATTGTT 600

GGGCATGCTT CCGAATGGAT GATCATGCGG GACCTGCTTT CTGTAGGGAT TGCCATTTAC 660

GATGAATTCC CTGAGATGTA TAACCTGGCT GCGGGTCGTT TTTTCAAAGA ACACCTGGTT 720

GCCCGCAACT GGTTTTATCC CTCGCATAAC TACCATCAGG GTATGTCATA CCTGAACGTA 780

AGATTTACCA ACGACCTTTT TGCCCTCTGG ATATTAGACC GGATGGGCGC TGGTAATGTG 840

TTTAATCCAG [remainder of row illegible in source] 900

CAGATTTTAG CAGGTGGAGA TGTAGATTAT TCCAGGAAAA AACCAAAATA TTATACGATG 960

CCTGCATTGC TTGCAGGTAG CTATTATAAA GATGAATACC TTAATTACGA ATTCCTGAAA 1020

GATCCCAATG TTGAGCCACA TTGCAAATTG TTCGAATTTT TATGGCGCGA TACCCAGTTG 1080

GGAAGTCGTA AGCCTGATGA TTTGCCACTT TCCAGGTACT CAGGATCGCC TTTTGGATGG 1140

ATGATTGCCC GTACCGGATG GGGTCCGGAA AGTGTGATTG CAGAGATGAA AGTCAACGAA 1200

TATTCCTTTC TTAACCATCA GCATCAGGAT GCAGGAGCCT TCCAGATCTA TTACAAAGGC 1260

CCGCTGGCCA TAGATGCAGG CTCGTATACA GGTTCTTCAG GAGGTTATAA CAGTCCGCAC 1320 

AACAAGAACT TTTTTAAGCG GACTATTGCA CACAATAGCT TGCTGATTTA CGATCCTAAA 1380 

GAAACTTTCA GTTCGTCGGG ATATGGTGGA AGTGACCATA CCGATTTTGC TGCCAACGAT 1440 

GGTGGTCAGC GGCTGCCCGG AAAAGGTTGG ATTGCACCCC GCGACCTTAA AGAAATGCTG 1500 

GCAGGCGATT TCAGGACCGG CAAAATTCTT GCCCAGGGCT TTGGTCCGGA TAACCAAACC 1560 

CCTGATTATA CTTATCTGAA AGGAGACATT ACAGCAGCTT ATTCGGCAAA AGTGAAGGAA 1620 

GTAAAACGTT CATTTCTATT CCTGAACCTT AAGGATGCCA AAGTTCCGGC AGCGATGATC 1680 

GTTTTTGACA AGGTAGTTGC TTCCAATCCT GATTTTAAGA AGTTCTGGTT GTTGCACAGT 1740 

ATTGAGCAGC CTGAAATAAA GGGGAATCAG ATTACCATAA AACGTACAAA AAACGGTGAT 1800 

AGTGGGATGT TGGTGAATAC GGCTTTGCTG CCGGATGCGG CCAATTCAAA CATTACCTCC 1860 

ATTGGCGGCA AGGGCAAAGA CTTCTGGGTG TTTGGTACCA ATTATACCAA TGATCCTAAA 1920 

CCGGGCACGG ATGAAGCATT GGAACGTGGA GAATGGCGTG TGGAAATCAC TCCAAAAAAG 1980 

GCAGCAGCCG AAGATTACTA CCTGAATGTG ATACAGATTG CCGACAATAC ACAGCAAAAA 2040 

TTACACGAGG TGAAGCGTAT TGACGGTGAC AAGGTTGTTG GTGTGCAGCT TGCTGACAGG 2100 

ATAGTTACTT TTAGCAAAAC TTCAGAAACT GTTGATCGTC CCTTTGGCTT TTCCGTTGTT 2160 

GGTAAAGGAA CATTCAAATT TGTGATGACC GATCTTTTAG CGGGTACCTG GCAGGTGCTG 2220 

AAAGACGGAA AAATACTTTA TCCTGCGCTT TCTGCAAAAG GTGATGATGG ACCCCTTTAT 2280 

TTTGAAGGAA CTGAAGGAAC CTACCGTTTT TTGAGATAA 2319 

FIG.4B 
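
Each row of the nucleotide sequence above ends with a running base count. As a minimal sketch, and not part of the original figure, the counters can be checked against the bases actually printed. The function name `check_row_counters` and the constant `FINAL_ROWS` are hypothetical; the three rows are copied from the end of the sequence, and the starting offset of 2160 is the counter printed on the row that precedes them.

```python
# Last three rows of the heparinase II coding sequence, each followed by the
# running base count printed in the figure. Names here are illustrative only.
FINAL_ROWS = [
    "GGTAAAGGAA CATTCAAATT TGTGATGACC GATCTTTTAG CGGGTACCTG GCAGGTGCTG 2220",
    "AAAGACGGAA AAATACTTTA TCCTGCGCTT TCTGCAAAAG GTGATGATGG ACCCCTTTAT 2280",
    "TTTGAAGGAA CTGAAGGAAC CTACCGTTTT TTGAGATAA 2319",
]


def check_row_counters(rows, start=0):
    """Verify each row's printed counter; return the final base position.

    `start` is the counter of the row immediately before the first row given.
    Raises ValueError on the first row whose counter does not match.
    """
    position = start
    for row in rows:
        *blocks, counter = row.split()
        position += sum(len(block) for block in blocks)
        if position != int(counter):
            raise ValueError(f"row ending at {counter}: counted {position}")
    return position


def bases_of(row):
    """Return the bases of one row with the trailing counter removed."""
    return "".join(row.split()[:-1])


if __name__ == "__main__":
    total = check_row_counters(FINAL_ROWS, start=2160)
    print("total length:", total)
    print("whole codons:", total % 3 == 0)
    print("ends with stop codon TAA:", bases_of(FINAL_ROWS[-1]).endswith("TAA"))
```

The same function can be run over every legible row of the figure; rows damaged in reproduction will raise an error rather than pass silently.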
ATGACTACGA AAATTTTTAA AAGGATCATT GTATTTGCTG TAATTGCCCT 50

ATCGTCGGGA AATATACTTG CACAAAGCTC TTCCATTACC AGGAAAGATT 100

TTGACCACAT CAACCTTGAG TATTCCGGAC TGGAAAAGGT TAATAAAGCA 150

GTTGCTGCCG GCAACTATGA CGATGCGGCC AAAGCATTAC TGGCATACTA 200

CAGGGAAAAA AGTAAGGCCA GGGAACCTGA TTTCAGTAAT GCAGAAAAGC 250

CTGCCGATAT ACGCCAGCCC ATAGATAAGG TTACGCGTGA AATGGCCGAC 300

AAGGCTTTGG TCCACCAGTT TCAACCGCAC AAAGGCTACG GCTATTTTGA 350

TTATGGTAAA GACATCAACT GGCAGATGTG GCCGGTAAAA GACAATGAAG 400

TACGCTGGCA GTTGCACCGT GTAAAATGGT GGCAGGCTAT GGCCCTGGTT 450

TATCACGCTA CGGGCGATGA AAAATATGCA AGAGAATGGG TATATCAGTA 500

CAGCGATTGG GCCAGAAAAA ACCCATTGGG CCTGTCGCAG GATAATGATA 550

AATTTGTGTG GCGGCCCCTT GAAGTGTCGG ACAGGGTACA AAGTCTTCCC 600

CCAACCTTCA GCTTATTTGT AAACTCGCCA GCCTTTACCC CAGCCTTTTT 650

AATGGAATTT TTAAACAGTT ACCACCAACA GGCCGATTAT TTATCTACGC 700

ATTATGCCGA ACAGGGAAAC CACCGTTTAT TTGAAGCCCA ACGCAACTTG 750

TTTGCAGGGG TATCTTTCCC TGAATTTAAA GATTCACCAA GATGRATPPA 800

AACCGGCATA TCGGTGCTGA ACACCGAGAT CAAAAAACAG GTTTATGCCG 850

ATGGGATGCA GTTTGAACTT TCACCAATTT ACCATGTAGC TGCCATCGAT 900

ATCTTCTTAA AGGCCTATGG TTCTGCAAAA CGAGTTAACC TTGAAAAAGA 950

ATTTCCGCAA TCTTATGTAC AAACTGTAGA AAATATGATT ATGGCGCTGA 1000

TCAGTATTTC ACTGCCAGAT TATAACACCC CTATGTTTGG AGATTCATGG 1050

ATTACAGATA AAAATTTCAG GATGGCACAG TTTGCCAGCT GGGCCCGGGT 1100 

TTTCCCGGCA AACCAGGCCA TAAAATATTT TGCTACAGAT GGCAAACAAG 1150 

GTAAGGCGCC TAACTTTTTA TCCAAAGCAT TGAGCAATGC AGGCTTTTAT 1200 

ACGTTTAGAA GCGGATGGGA TAAAAATGCA ACCGTTATGG TATTAAAAGC 1250 

CAGTCCTCCC GGAGAATTTC ATGCCCAGCC GGATAACGGG ACTTTTGAAC 1300 

TTTTTATAAA GGGCAGAAAC TTTACCCCAG ACGCCGGGGT ATTTGTGTAT 1350 

AGCGGCGACG AAGCCATCAT GAAACTGCGG AACTGGTACC GTCAAACCCG 1400 

CATACACAGC ACGCTTACAC TCGACAATCA AMTATGGTC ATTACCAAAG 1450 

CCCGGCAAAA CAAATGGGAA ACAGGAAATA ACCTTGATGT GCTTACCTAT 1500 

ACCAACCCAA GCTATCCGAA TCTGGACCAT CAGCGCAGTG TACTTTTCAT 1550 

CAACAAAAAA TACTTTCTGG TCATCGATAG GGCAATAGGC GAAGCTACCG 1600 

GAAACCTGGG CGTACACTGG CAGCTTAAAG AAGACAGCAA CCCTGTTTTC 1650 

GATAAGACAA AGAACCGGGT TTACACCACT TACAGAGATG GTAACAACCT 1700 

GATGATCCAA TCGTTGAATG CGGACAGGAC CAGCCTCAAT GAAGAAGAAG 1750 

GAAAGGTATC TTATGTTTAC AATAAGGAGC TGAAAAGACC TGCTTTCGTA 1800 

TTTGAAAAGC CTAAAAAGAA TGCCGGCACA CAAAATTTTG TCAGTATAGT 1850 

TTATCCATAC GACGGCCAGA AGGCTCCAGA GATCAGCATA CGGGAAAACA 1900 

AGGGCAATGA TTTTGAGAAA GGCAAGCTTA ATCTAACCCT TACCATTAAC 1950 

GGAAAACAAC AGCTTGTGTT GGTTCCTTAG 1980 
MTTKIFKRII VFAVIALSSG NILAQSSSIT RKDFDHINLE YSGLEKVNKA VAAGNYDDAA
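
The amino acid row above corresponds to the first 180 bases of the nucleotide sequence that begins ATGACTACGA. A minimal sketch of that correspondence follows, assuming the standard genetic code. The names `CODON_TABLE`, `translate` and `HEP3_5PRIME` are hypothetical and introduced only for illustration; the bases are copied from the first three rows of that sequence plus the first 30 bases of the fourth row.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
# "*" marks a stop codon. All names in this block are illustrative.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring spaces and any
    trailing bases that do not form a whole codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 180 bases of the sequence, in the ten-base blocks used in the figure.
HEP3_5PRIME = (
    "ATGACTACGA AAATTTTTAA AAGGATCATT GTATTTGCTG TAATTGCCCT "
    "ATCGTCGGGA AATATACTTG CACAAAGCTC TTCCATTACC AGGAAAGATT "
    "TTGACCACAT CAACCTTGAG TATTCCGGAC TGGAAAAGGT TAATAAAGCA "
    "GTTGCTGCCG GCAACTATGA CGATGCGGCC"
)

if __name__ == "__main__":
    protein = translate(HEP3_5PRIME)
    # Print in blocks of ten residues, matching the layout of the figure.
    print(" ".join(protein[i:i + 10] for i in range(0, len(protein), 10)))
```

Translating the nucleotide rows in this way is one means of confirming residues that are ambiguous in a reproduced figure, for example distinguishing the letter I from the digit 1.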